TITLE OF THE INVENTION

RECEPTOR FUNCTION ASSAY FOR G-PROTEIN COUPLED RECEPTORS AND 
ORPHAN RECEPTORS BY REPORTER ENZYME MUTANT COMPLEMENTATION 

BACKGROUND OF THE INVENTION 

This application claims the benefit of Provisional Application Serial No.
60/180,669, filed February 7, 2000. The entirety of that provisional application is
incorporated herein by reference.
Field of the Invention 

This invention relates to methods of detecting G-protein-coupled receptor (GPCR) 
activity, and provides methods of assaying GPCR activity and methods for screening for 
GPCR ligands, G-protein-coupled receptor kinase (GRK) activity, and compounds that 
interact with components of the GPCR regulatory process. 

The actions of many extracellular signals are mediated by the interaction of G-protein- 
coupled receptors (GPCRs) and guanine nucleotide-binding regulatory proteins (G-proteins). 
G-protein-mediated signaling systems have been identified in many divergent organisms, 
such as mammals and yeast. The GPCRs represent a large super family of proteins which 
have divergent amino acid sequences, but share common structural features, in particular, the 
presence of seven transmembrane helical domains. GPCRs respond to, among other 
extracellular signals, neurotransmitters, hormones, odorants and light. Individual GPCR 
types activate a particular signal transduction pathway; at least ten different signal 
transduction pathways are known to be activated via GPCRs. For example, the beta 2-
adrenergic receptor (β2AR) is a prototype mammalian GPCR. In response to agonist binding,
β2AR receptors activate a G-protein (Gs) which in turn stimulates adenylate cyclase activity
and results in increased cyclic adenosine monophosphate (cAMP) production in the cell. 



The signaling pathway and final cellular response that result from GPCR stimulation
depend on the specific class of G-protein with which the particular receptor is coupled
(Hamm, "The many faces of G-Protein Signaling," J. Biol. Chem., 273:669-672 (1998)). For
instance, coupling to the Gs class of G-proteins stimulates cAMP production and activation
of Protein Kinase A and C pathways, whereas coupling to the Gi class of G-proteins
down-regulates cAMP. Other second messenger systems, such as calcium, phospholipase C, and
phosphatidylinositol 3, may also be utilized. As a consequence, GPCR signaling events have
predominantly been measured via quantification of these second messenger products.

A common feature of GPCR physiology is desensitization and recycling of the
receptor through the processes of receptor phosphorylation, endocytosis and
dephosphorylation (Ferguson, et al., "G-protein-coupled receptor regulation: role of
G-protein-coupled receptor kinases and arrestins," Can. J. Physiol. Pharmacol., 74:1095-1110
(1996)). Ligand-occupied GPCRs can be phosphorylated by two families of serine/threonine
kinases, the G-protein-coupled receptor kinases (GRKs) and the second messenger-dependent
protein kinases such as protein kinase A and protein kinase C. Phosphorylation by either
class of kinases serves to down-regulate the receptor by uncoupling it from its corresponding
G-protein. GRK phosphorylation also serves to down-regulate the receptor by recruitment of
a class of proteins known as the arrestins, which bind the cytoplasmic domain of the receptor
and promote clustering of the receptor into endocytic vesicles. Once the receptor is
endocytosed, it will either be degraded in lysosomes or dephosphorylated and recycled back
to the plasma membrane as fully functional receptor.

Binding of an arrestin protein to an activated receptor has been documented as a
common phenomenon for a variety of GPCRs ranging from rhodopsin to β2AR to the
neurotensin receptor (Barak, et al., "A β-arrestin/Green Fluorescent Protein Biosensor
for Detecting G Protein-coupled Receptor Activation," J. Biol. Chem., 272:27497-500
(1997)). Consequently, monitoring arrestin interaction with a specific GPCR can be utilized
as a generic tool for measuring GPCR activation. Similarly, a single G-protein and GRK also
partner with a variety of receptors (Hamm (1998) and Pitcher et al., "G-Protein-Coupled
Receptor Kinases," Annu. Rev. Biochem., 67:653-92 (1998)), such that these
protein/protein interactions may also be monitored to determine receptor activity.

The present invention involves the use of a proprietary technology (ICAST™,
Intercistronic Complementation Analysis Screening Technology) for monitoring
protein/protein interactions in GPCR signaling. The method involves using two inactive
β-galactosidase mutants, each of which is fused with one member of an interacting protein
pair, such as a GPCR and an arrestin. The formation of an active β-galactosidase complex is
driven by interaction of the target proteins. In this system, β-galactosidase activity acts as a
read-out of GPCR activity. FIGURE 23 is a schematic depicting the method of the present
invention; it shows two inactive mutants that become active when they interact. In addition,
this technology could be used to monitor GPCR-mediated signaling pathways via other
downstream signaling components such as G-proteins, GRKs or c-Src.
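
As a purely illustrative sketch of how such a read-out might be quantified, the following Python fragment expresses complemented reporter activity as fold induction over the unstimulated signal. The function name, the use of relative light units, and the sample values are assumptions made for illustration and are not taken from this disclosure.

```python
from statistics import mean


def fold_induction(stimulated_rlu, basal_rlu):
    """Return mean stimulated signal divided by mean basal signal.

    Both arguments are lists of replicate luminescence readings
    (hypothetical relative light units) from the same cell line.
    """
    basal = mean(basal_rlu)
    if basal <= 0:
        raise ValueError("basal signal must be positive")
    return mean(stimulated_rlu) / basal


# Hypothetical triplicate readings, with and without agonist.
ratio = fold_induction([3000.0, 3200.0, 2800.0], [1000.0, 900.0, 1100.0])
print(f"fold induction: {ratio:.1f}")
```

A ratio well above 1 would be consistent with agonist-driven interaction of the two fusion proteins; what ratio counts as meaningful would have to be established empirically for each cell line.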

Many therapeutic drugs in use today target GPCRs, as they regulate vital
physiological responses, including vasodilation, heart rate, bronchodilation, endocrine
secretion and gut peristalsis. See, e.g., Lefkowitz et al., Annu. Rev. Biochem., 52:159 (1983).
For instance, drugs targeting the highly studied GPCR β2AR are used in the treatment of
anaphylaxis, shock, hypertension, asthma and other conditions. Some of these drugs mimic
the ligand for this receptor. Other drugs act to antagonize the receptor in cases when disease
arises from spontaneous activity of the receptor.

Efforts such as the Human Genome Project are identifying new GPCRs ("orphan"
receptors) whose physiological roles and ligands are unknown. It is estimated that several
thousand GPCRs exist in the human genome. Of the 250 GPCRs identified to date, only 150
have been associated with ligands.

SUMMARY OF THE INVENTION

A first aspect of the present invention is a method that monitors GPCR function
proximally at the site of receptor activation, thus providing more information for drug
discovery purposes due to fewer competing mechanisms. Activation of the GPCR is
measured by a read-out for interaction of the receptor with a regulatory component such as
arrestin, G-protein, GRK or other kinases, the binding of which to the receptor is dependent
upon agonist occupation of the receptor. Protein/protein interaction is detected by
complementation of reporter proteins such as utilized by the ICAST™ technology.

A further aspect of the present invention is a method of assessing G-protein-coupled
receptor (GPCR) pathway activity under test conditions by providing a test cell that expresses
a GPCR, e.g., muscarinic, adrenergic, dopamine, angiotensin or endothelin, as a fusion
protein to a mutant reporter protein and an interacting protein, i.e., G-proteins, arrestin or
GRK, as a fusion protein with a complementing reporter protein. When test cells are exposed
to a known agonist to the target GPCR under test conditions, activation of the GPCR will be
monitored by complementation of the reporter enzyme. Increased reporter enzyme activity
reflects interaction of the GPCR with its interacting protein partner.

A further aspect of the present invention is a method of assessing GPCR pathway
activity in the presence of a test kinase.

A further aspect of the present invention is a method of assessing GPCR pathway
activity in the presence of a test G-protein.

A further aspect of the present invention is a method of assessing GPCR pathway
activity upon exposure of the test cell to a test ligand.

A further aspect of the present invention is a method of assessing GPCR pathway
activity upon co-expression in the test cell of a second receptor.

A further aspect of the present invention is a method for screening for a ligand or
agonist to an orphan GPCR. The ligand or agonist could be contained in natural or synthetic
libraries or mixtures or could be a physical stimulus. A test cell is provided that expresses the
orphan GPCR as a fusion protein with one β-galactosidase mutant and, for example, an
arrestin or mutant form of arrestin as a fusion protein with another β-galactosidase mutant.
The interaction of the arrestin with the orphan GPCR upon receptor activation is measured by
enzymatic activity of the complemented β-galactosidase. The test cell is exposed to a test
compound, and an increase in β-galactosidase activity indicates the presence of a ligand or
agonist.
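
One way a library screen of this kind might be scored is sketched below in Python. The rule used here, flagging any compound whose signal exceeds the vehicle-control mean by a chosen number of standard deviations, is a common screening convention and is an assumption of this sketch, not a criterion stated in this disclosure. The function name, compound labels and readings are hypothetical.

```python
from statistics import mean, stdev


def pick_hits(vehicle_signals, compound_signals, n_sd=3.0):
    """Return the sorted names of compounds whose reporter signal exceeds
    the vehicle mean plus n_sd sample standard deviations.

    vehicle_signals: list of readings from vehicle-only wells.
    compound_signals: dict mapping compound name to its reading.
    """
    cutoff = mean(vehicle_signals) + n_sd * stdev(vehicle_signals)
    return sorted(name for name, s in compound_signals.items() if s > cutoff)


# Hypothetical plate: four vehicle wells and three library compounds.
hits = pick_hits(
    [100.0, 110.0, 90.0, 100.0],
    {"cmpd_A": 120.0, "cmpd_B": 400.0, "cmpd_C": 130.0},
)
print(hits)
```

Any compound flagged this way would only be a candidate ligand or agonist; confirmation in a dose-response format would still be needed.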

A further aspect of the present invention is a method for screening a protein of
interest, for example, an arrestin protein (or mutant form of the arrestin protein) for the ability
to bind to a phosphorylated, or activated, GPCR. A cell is provided that expresses a GPCR
and contains β-arrestin. The cell is exposed to a known GPCR agonist and then reporter
enzyme activity is detected. Increased reporter enzyme activity indicates that the β-arrestin
molecule can bind to phosphorylated, or activated, GPCR in the test cell.

A further aspect of the present invention is a method to screen for an agonist to a
specific GPCR. The agonist could be contained in natural or synthetic libraries or could be a
physical stimulus. A test cell is provided that expresses a GPCR as a fusion protein with one
β-galactosidase mutant and, for example, an arrestin as a fusion protein with another
β-galactosidase mutant. The interaction of arrestin with the GPCR upon receptor activation is
measured by enzymatic activity of the complemented β-galactosidase. The test cell is
exposed to a test compound, and an increase in β-galactosidase activity indicates the presence
of an agonist. The test cell may express a known GPCR or a variety of known GPCRs, or
may express an unknown GPCR or a variety of unknown GPCRs. The GPCR may be, for
example, an odorant GPCR or a βAR GPCR.

A further aspect of the present invention is a method of screening a test compound for
G-protein-coupled receptor (GPCR) antagonist activity. A test cell is provided that expresses
a GPCR as a fusion protein with one β-galactosidase mutant and, for example, an arrestin as a
fusion protein with another β-galactosidase mutant. The interaction of arrestin with the
GPCR upon receptor activation is measured by enzymatic activity of the complemented
β-galactosidase. The test cell is exposed to a test compound, and an increase in β-galactosidase
activity indicates the presence of an agonist. The cell is exposed to a test compound and to a
GPCR agonist, and reporter enzyme activity is detected. When exposure to the agonist occurs
at the same time as or subsequent to exposure to the test compound, a decrease in
β-galactosidase activity after exposure to the test compound indicates that the test compound
has antagonist activity to the GPCR.
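
The agonist and antagonist read-outs described above can be combined into a simple decision rule, sketched here in Python. The normalization to the window between basal and agonist-only signal, the 50% threshold, and all names and values are assumptions chosen for illustration; this disclosure does not prescribe a particular cutoff.

```python
def classify_compound(basal, compound_alone, agonist_alone,
                      compound_plus_agonist, threshold=0.5):
    """Classify a test compound from four reporter readings.

    Signals are normalized to the assay window (agonist_alone - basal).
    A compound that by itself reaches `threshold` of the window is called
    an agonist; one that removes at least `threshold` of the agonist
    response is called an antagonist; otherwise it is called inactive.
    """
    window = agonist_alone - basal
    if window <= 0:
        raise ValueError("reference agonist must raise signal above basal")
    agonist_like = (compound_alone - basal) / window
    inhibition = 1.0 - (compound_plus_agonist - basal) / window
    if agonist_like >= threshold:
        return "agonist"
    if inhibition >= threshold:
        return "antagonist"
    return "inactive"


# Hypothetical readings: the compound alone does nothing, but it blunts
# the response to the reference agonist.
print(classify_compound(100.0, 110.0, 1100.0, 200.0))
```

The agonist check is applied first, so a compound that stimulates the receptor on its own is never reported as an antagonist under this rule.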

A further aspect of the present invention is a method of screening a sample solution
for the presence of an agonist, antagonist or ligand to a G-protein-coupled receptor (GPCR).
A test cell is provided that expresses a GPCR fusion and contains, for example, a β-arrestin
protein fusion. The test cell is exposed to a sample solution, and reporter enzyme activity is
assessed. Changed reporter enzyme activity after exposure to the sample solution indicates
the sample solution contains an agonist, antagonist or ligand for a GPCR expressed in the cell.

A further aspect of the present invention is a method of screening a cell for the
presence of a G-protein-coupled receptor (GPCR).

A further aspect of the present invention is a method of screening a plurality of cells
for those cells which contain a G-protein-coupled receptor (GPCR).

A further aspect of the invention is a method for mapping GPCR-mediated signaling
pathways. For instance, the system could be utilized to monitor interaction of c-Src with
β-arrestin-1 upon GPCR activation. Additionally, the system could be used to monitor
protein/protein interactions involved in cross-talk between GPCR signaling pathways and
other pathways such as that of the receptor tyrosine kinases or Ras/Raf.

A further aspect of the invention is a method for monitoring homo- or hetero-
dimerization of GPCRs upon agonist or antagonist stimulation.

A further aspect of the invention is a method of screening a cell for the presence of a 
G-protein-coupled receptor (GPCR) responsive to a GPCR agonist. A cell is provided that 
contains protein partners that interact downstream in the GPCR's pathway. The protein 
partners are expressed as fusion proteins to the mutant, complementing enzyme and are used
to monitor activation of the GPCR. The cell is exposed to a GPCR agonist and then
enzymatic activity of the reporter enzyme is detected. Increased reporter enzyme activity
indicates that the cell contains a GPCR responsive to the agonist.

The invention is achieved by using ICAST™ protein/protein interaction screening to
map signaling pathways. This technology is applicable to a variety of known and unknown
GPCRs with diverse functions. They include, but are not limited to, the following
sub-families of GPCRs:

(a) receptors that bind to amine-like ligands: Acetylcholine muscarinic receptor (M1
to M5), alpha and beta Adrenoceptors, Dopamine receptors (D1, D2, D3 and D4), Histamine
receptors (H1 and H2), Octopamine receptor and Serotonin receptors (5HT1, 5HT2, 5HT4,
5HT5, 5HT6, 5HT7);

(b) receptors that bind to a peptide ligand: Angiotensin receptor, Bombesin receptor,
Bradykinin receptor, C-C chemokine receptors (CCR1 to CCR8, and CCR10), C-X-C type
Chemokine receptors (CXC-R5), Cholecystokinin type A receptor, CCK type receptors,
Endothelin receptor, Neurotensin receptor, FMLP-related receptors, Somatostatin receptors
(type 1 to type 5) and Opioid receptors (type D, K, M, X);

(c) receptors that bind to hormone proteins: Follicle stimulating hormone receptor,
Thyrotropin receptor and Lutropin-choriogonadotropic hormone receptor;

(d) receptors that bind to neurotransmitters: Substance P receptor, Substance K
receptor and neuropeptide Y receptor;

(e) Olfactory receptors: Olfactory type 1 to type 11, Gustatory and odorant receptors;

(f) Prostanoid receptors: Prostaglandin E2 (EP1 to EP4 subtypes), Prostacyclin and
Thromboxane;

(g) receptors that bind to metabotropic substances: Metabotropic glutamate group I to
group III receptors;

(h) receptors that respond to physical stimuli, such as light, or to chemical stimuli,
such as taste and smell; and

(i) orphan GPCRs: the natural ligand to the receptor is undefined.

ICAST™ provides many benefits to the screening process, including the ability to 
monitor protein interactions in any sub-cellular compartment-membrane, cytosol and nucleus; 
the ability to achieve a more physiologically relevant model without requiring protein 
overexpression; and the ability to achieve a functional assay for receptor binding allowing 
high information content. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1. Cellular expression levels of β2 adrenergic receptor (β2AR) and
β-arrestin-2 (βArr2) in C2 clones. Quantification of β-gal fusion protein was performed using
antibodies against β-gal and purified β-gal protein in a titration curve by a standardized
ELISA assay. Figure 1A shows expression levels of β2AR-βgalΔα clones (in expression
vector pICAST ALC). Figure 1B shows expression levels of βArr2-βgalΔω in expression
vector pICAST OMC4 for clones 9-3, -7, -9, -10, -19 and -24, or in expression vector
pICAST OMN4 for clones 12-4, -9, -16, -18, -22 and -24.

FIGURE 2. Receptor β2AR activation was measured by agonist-stimulated cAMP
production. C2 cells expressing pICAST ALC β2AR (clone 5) or parental cells were treated
with increasing concentrations of (-)isoproterenol and 0.1 mM IBMX. The quantification of
cAMP level was expressed as pmol/well.



FIGURE 3. Interaction of activated receptor β2AR and arrestin can be measured by
β-galactosidase complementation. Figure 3A shows a time course of β-galactosidase activity
in response to agonist (-)isoproterenol stimulation in C2 cells expressing β2AR-βgalΔα (β2AR
alone, in expression vector pICAST ALC), or C2 clones and a pool of C2 cells co-expressing
β2AR-βgalΔα and βArr2-βgalΔω (in expression vectors pICAST ALC and pICAST OMC).
Figure 3B shows a time course of β-galactosidase activity in response to agonist
(-)isoproterenol stimulation in C2 cells expressing β2AR alone (in expression vector pICAST
ALC) and C2 clones co-expressing β2AR and βArr1 (in expression vectors pICAST ALC and
pICAST OMC).

FIGURE 4. Agonist dose response for interaction of β2AR and arrestin can be
measured by β-galactosidase complementation. Figure 4A shows a dose response to agonists
(-)isoproterenol and procaterol in C2 cells co-expressing pICAST ALC β2AR and pICAST
OMC βArr2 fusion constructs. Figure 4B shows a dose response to agonists (-)isoproterenol
and procaterol in C2 cells co-expressing pICAST ALC β2AR and pICAST OMC βArr1
fusion constructs.

FIGURE 5. Antagonist-mediated inhibition of receptor activity can be measured by
β-galactosidase complementation in cells co-expressing β2AR-βgalΔα and βArr-βgalΔω.
Figure 5A shows specific inhibition with adrenergic antagonists ICI-118,551 and propranolol
of β-galactosidase activity in C2 clones co-expressing pICAST ALC β2AR and pICAST
OMC βArr2 fusion constructs after incubation with agonist (-)isoproterenol. Figure 5B
shows specific inhibition of β-galactosidase activity with adrenergic antagonists ICI-118,551
and propranolol in C2 clones co-expressing pICAST ALC β2AR and pICAST OMC βArr1
fusion constructs in the presence of agonist (-)isoproterenol.

FIGURE 6. C2 cells expressing adenosine receptor A2a show cAMP induction in
response to agonist (CGS-21680) treatment. C2 parental cells and C2 cells co-expressing
pICAST ALC A2aR and pICAST OMC βArr1 as a pool or as selected clones were measured
for agonist-induced cAMP response (pmol/well).

FIGURE 7. Agonist-stimulated cAMP response in C2 cells co-expressing Dopamine
receptor D1 (D1-βgalΔα) and β-arrestin-2 (βArr2-βgalΔω). The clone expressing
βArr2-βgalΔω (Arr2 alone) was used as a negative control in the assay. Cells expressing
D1-βgalΔα in addition to βArr2-βgalΔω responded to agonist treatment (3-hydroxytyramine
hydrochloride at 3 μM). D1(PIC2) or D1(PIC3) designate D1 in expression vector pICAST
ALC2 or pICAST ALC4, respectively.

FIGURE 8. A variety of mammalian cell lines can be used to generate stable cells for
monitoring GPCR and arrestin interactions. FIGURE 8A, FIGURE 8B and FIGURE 8C show
the examples of HEK293, CHO and CHW cell lines co-expressing adrenergic receptor β2AR
and arrestin fusion proteins of β-galactosidase mutants. The β-galactosidase activity was used
to monitor agonist-induced interaction of β2AR and arrestin proteins.

FIGURE 9. Beta-gal complementation can be used to monitor β2 adrenergic receptor
homo-dimerization. FIGURE 9A shows β-galactosidase activity in HEK293 clones
co-expressing pICAST ALC β2AR and pICAST OMC β2AR. FIGURE 9B shows a cAMP
response to agonist (-)isoproterenol in HEK293 clones co-expressing pICAST ALC β2AR
and pICAST OMC β2AR. HEK293 parental cells were included in the assays as negative
controls.

FIGURE 10A. pICAST ALC: Vector for expression of β-galΔα as a C-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔα; GS Linker, (GGGGS)n;
NeoR, neomycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 10B. Nucleotide sequence for pICAST ALC.

FIGURE 11A. pICAST ALN: Vector for expression of β-galΔα as an N-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔα; GS Linker, (GGGGS)n;
NeoR, neomycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 11B. Nucleotide sequence for pICAST ALN.

FIGURE 12A. pICAST OMC: Vector for expression of β-galΔω as a C-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔω; GS Linker, (GGGGS)n;
Hygro, hygromycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 12B. Nucleotide sequence for pICAST OMC.

FIGURE 13A. pICAST OMN: Vector for expression of β-galΔω as an N-terminal
fusion to the target protein. This construct contains the following features: MCS, multiple
cloning site for cloning the target protein in frame with the β-galΔω; GS Linker, (GGGGS)n;
Hygro, hygromycin resistance gene; IRES, internal ribosome entry site; ColE1 ori, origin of
replication for growth in E. coli; 5'MoMuLV LTR and 3'MoMuLV LTR, viral promoter and
polyadenylation signals from the Moloney Murine leukemia virus.

FIGURE 13B. Nucleotide sequence for pICAST OMN.

FIGURE 14. pICAST ALC βArr2: Vector for expression of β-galΔα as a C-terminal
fusion to β-arrestin-2. The coding sequence of human β-arrestin-2 (GenBank Accession
Number: NM_004313) was cloned in frame to β-galΔα in a pICAST ALC vector.

FIGURE 15. pICAST OMC βArr2: Vector for expression of β-galΔω as a
C-terminal fusion to β-arrestin-2. The coding sequence of human β-arrestin-2 (GenBank
Accession Number: NM_004313) was cloned in frame to β-galΔω in a pICAST OMC vector.

FIGURE 16. pICAST ALC βArr1: Vector for expression of β-galΔα as a C-terminal
fusion to β-arrestin-1. The coding sequence of human β-arrestin-1 (GenBank Accession
Number: NM_004041) was cloned in frame to β-galΔα in a pICAST ALC vector.

FIGURE 17. pICAST OMC βArr1: Vector for expression of β-galΔω as a
C-terminal fusion to β-arrestin-1. The coding sequence of human β-arrestin-1 (GenBank
Accession Number: NM_004041) was cloned in frame to β-galΔω in a pICAST OMC vector.

FIGURE 18. pICAST ALC β2AR: Vector for expression of β-galΔα as a C-terminal
fusion to β2 Adrenergic Receptor. The coding sequence of human β2 Adrenergic Receptor
(GenBank Accession Number: NM_000024) was cloned in frame to β-galΔα in a pICAST
ALC vector.

FIGURE 19. pICAST OMC β2AR: Vector for expression of β-galΔω as a
C-terminal fusion to β2 Adrenergic Receptor. The coding sequence of human β2 Adrenergic
Receptor (GenBank Accession Number: NM_000024) was cloned in frame to β-galΔω in a
pICAST OMC vector.

FIGURE 20. pICAST ALC A2aR: Vector for expression of β-galΔα as a C-terminal
fusion to Adenosine 2a Receptor. The coding sequence of human Adenosine 2a Receptor
(GenBank Accession Number: NM_000675) was cloned in frame to β-galΔα in a pICAST
ALC vector.

FIGURE 21. pICAST OMC A2aR: Vector for expression of β-galΔω as a C-terminal
fusion to Adenosine 2a Receptor. The coding sequence of human Adenosine 2a Receptor
(GenBank Accession Number: NM_000675) was cloned in frame to β-galΔω in a pICAST
OMC vector.

FIGURE 22. pICAST ALC D1: Vector for expression of β-galΔα as a C-terminal
fusion to Dopamine D1 Receptor. The coding sequence of human Dopamine D1 Receptor
(GenBank Accession Number: X58987) was cloned in frame to β-galΔα in a pICAST ALC
vector.

FIGURE 23. A schematic depicting the method of the invention, which shows
two inactive mutants that become active when they interact.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

All literature and patents cited in this disclosure are incorporated herein by reference. 

The present invention provides a method to interrogate GPCR function and pathways. 
The G-protein-coupled superfamily continues to expand rapidly as new receptors are 
discovered through automated sequencing of cDNA libraries or genomic DNA. It is
estimated that several thousand GPCRs may exist in the human genome; as many as 250
GPCRs have been cloned, and only as few as 150 have been associated with ligands. The
means by which these, or newly discovered orphan receptors, will be associated with their
cognate ligands and physiological functions represents a major challenge to biological and
biomedical research. The identification of an orphan receptor generally requires an
individualized assay and a guess as to its function. The interrogation of a GPCR's signaling
behavior by introducing a replacement receptor eliminates these prerequisites because it can 
be performed with and without prior knowledge of other signaling events. It is sensitive, 
rapid and easily performed and should be applicable to nearly all GPCRs because the 
majority of these receptors should desensitize by a common mechanism. 

Various approaches have been used to monitor intracellular activity in response to a
stimulant, e.g., enzyme-linked immunosorbent assay (ELISA); Fluorescence Imaging Plate
Reader assay (FLIPR™, Molecular Devices Corp., Sunnyvale, CA); EVOscreen™,
EVOTEC™, Evotec Biosystems GmbH, Hamburg, Germany; and techniques developed by
CELLOMICS™, Cellomics, Inc., Pittsburgh, PA.

Germino, F.J., et al., "Screening for in vivo protein-protein interactions," Proc. Natl.
Acad. Sci., 90(3):933-7 (1993), discloses an in vivo approach for the isolation of proteins
interacting with a protein of interest.

Phizicky, E.M., et al., "Protein-protein interactions: methods for detection and
analysis," Microbiol. Rev., 59(1):94-123 (1995), provides a review of biochemical,
molecular biological and genetic methods used to study protein-protein interactions.

Offermanns, et al., "Gα15 and Gα16 Couple a Wide Variety of Receptors to
Phospholipase C," J. Biol. Chem., 270(25):15175-80 (1995), discloses that Gα15 and Gα16 can
be activated by a wide variety of G-protein-coupled receptors. The selective coupling of an
activated receptor to a distinct pattern of G-proteins is regarded as an important requirement
to achieve accurate signal transduction. Id.

Barak et al., "A β-arrestin/Green Fluorescent Protein Biosensor for Detecting G
Protein-coupled Receptor Activation," J. Biol. Chem., 272(44):27497-500 (1997) and U.S.
Patent No. 5,891,646, disclose the use of a β-arrestin/green fluorescent protein (GFP) fusion
to monitor protein translocation upon stimulation of GPCR.

The present invention involves a method for monitoring protein-protein interactions in 
GPCR pathways as a complete assay using ICAST™ (Intercistronic Complementation 
Analysis Screening Technology as disclosed in pending U.S. patent application serial no. 
053,164, filed April 1, 1998, the entire contents of which are incorporated herein by 
reference). This invention enables an array of assays, including GPCR binding assays, to be 
achieved directly within the cellular environment in a rapid, non-radioactive assay format 
amenable to high-throughput screening. Using existing technology, assays of this type are 
currently performed in a non-cellular environment and require the use of radioisotopes. 

The present invention, combined with Tropix ICAST™ and Advanced Discovery
Sciences™ technologies, e.g., ultra high-throughput screening, provides highly sensitive
cell-based methods for interrogating GPCR pathways which are amenable to high-throughput
screening (HTS). These methods are an advancement over the invention disclosed in U.S.
Patent 5,891,646, which relies on microscopic imaging of GPCR components as fusions with
green fluorescent protein. Imaging techniques are limited by low throughput, lack of
thorough quantification and low signal-to-noise ratios. Unlike yeast-based two-hybrid assays
used to monitor protein/protein interactions in high-throughput assays, the present invention
is applicable to a variety of cells including mammalian cells, plant cells, protozoa cells such
as E. coli and cells of invertebrate origin such as yeast, slime mold (Dictyostelium) and
insects; detects interactions at the site of the receptor target or downstream target proteins
rather than in the nucleus; and does not rely on indirect read-outs such as transcriptional
activation. The present invention provides assays with greater physiological relevance and
fewer false negatives.

Advanced Discovery Sciences™ is in the business of offering custom-developed 
screening assays optimized for individual assay requirements and validated for automation. 
These assays are designed by HTS experts to deliver superior assay performance. Advanced 
Discovery Sciences'™ custom assay development service encompasses the design, 
development, optimization and transfer of high performance screening assays. Advanced 
Discovery Sciences™ works to design new assays or convert existing assays to ultra-sensitive 
luminescent assays ready for the rigors of HTS. Among the technologies developed
by Advanced Discovery Sciences™ is the cAMP-Screen™ immunoassay system. This
system provides ultrasensitive determination of cAMP levels in cell lysates. The
cAMP-Screen™ assay utilizes the high-sensitivity chemiluminescent alkaline phosphatase
(AP) substrate CSPD® with Sapphire-II™ luminescence enhancer.

EXAMPLE:

GPCR activation can be measured through monitoring the binding of ligand-activated
GPCR by an arrestin. In this assay system, a GPCR, e.g., β adrenergic receptor (β2AR), and a
β-arrestin are co-expressed in the same cell as fusion proteins with β-gal mutants. As
illustrated in Figure 1, the β2AR is expressed as a fusion protein with the Δα form of β-gal
mutant (β2ADRΔα) and the β-arrestin as a fusion protein with the Δω mutant of β-gal
(β-ArrΔω). The two fusion proteins exist inside of a resting (or un-stimulated) cell in separate
compartments, i.e., membrane for GPCR and cytosol for arrestin, and they cannot form an
active β-galactosidase enzyme. When such a cell is treated with an agonist or a ligand, the
ligand-occupied and activated receptor will become a high affinity binding site for arrestin.
The interaction between an activated β2ADRΔα and β-ArrΔω drives the β-gal mutant
complementation. The enzyme activity can be measured by using an enzyme substrate,
which upon cleavage releases a product measurable by colorimetry, fluorescence, or
chemiluminescence (e.g., Tropix product GalScreen™).

Experiment protocol- 

1. In the first step, the expression vectors for β2ADRΔα and βArr2Δω were 
engineered in the selectable retroviral vectors pICAST ALC, as described in Figure 18, and 
pICAST OMC, as described in Figure 15. 

2. In the second step, the two expression constructs were transduced into either 
C2C12 myoblast cells or other mammalian cell lines, such as COS-7, CHO, A431, HEK 293, 
and CHW. Following selection with antibiotic drugs, stable clones expressing both fusion 
proteins at appropriate levels were selected. 

3. In the last step, the cells expressing both β2ADRΔα and βArr2Δω were tested for 
response by agonist/ligand-stimulated β-galactosidase activity. Triplicate samples of cells 
were plated at 10,000 cells in a 100 microliter volume per well of a 96-well culture plate. Cells 
were cultured for 24 hours before assay. For the agonist assay (Figures 3 and 4), cells were treated 
with variable concentrations of agonist, for example, (-) isoproterenol, procaterol, 
dobutamine, terbutaline or L-phenylephrine, for 60 min at 37 °C. The induced β-galactosidase 
activity was measured by addition of Tropix GalScreen™ substrate (Applied Biosystems), 
and luminescence was measured in a Tropix TR717™ luminometer (Applied Biosystems). For the 
antagonist assay (Figure 5), cells were pre-incubated for 10 min in fresh medium without 
serum in the presence of ICI-118,551 or propranolol, followed by addition of 10 micromolar 
(-) isoproterenol. 
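
By way of illustration only, the triplicate luminescence readings obtained in step 3 might be reduced to fold stimulation over unstimulated cells (agonist assay) and to percent inhibition of the agonist response (antagonist assay) as sketched below. The function names, the data layout and every numerical value are hypothetical and form no part of the protocol described above; the normalization shown is one common way of treating such data and is not asserted to be the calculation used to prepare Figures 3 to 5.

```python
from statistics import mean

def fold_over_basal(readings, basal):
    """Return {concentration: mean RLU / mean basal RLU}.

    readings: dict mapping agonist concentration (molar) to a list of
              replicate luminescence readings (RLU), e.g. triplicates.
    basal:    list of replicate RLU readings from unstimulated wells.
    """
    basal_mean = mean(basal)
    return {conc: mean(rlu) / basal_mean for conc, rlu in readings.items()}

def percent_inhibition(antagonist_rlu, agonist_only_rlu, basal_rlu):
    """Percent reduction of the agonist-induced signal window by an antagonist."""
    basal = mean(basal_rlu)
    window = mean(agonist_only_rlu) - basal  # signal induced by agonist alone
    return 100.0 * (1.0 - (mean(antagonist_rlu) - basal) / window)

if __name__ == "__main__":
    # Hypothetical readings, invented purely to exercise the functions.
    basal = [100, 110, 90]
    agonist = {1e-8: [150, 160, 140], 1e-6: [300, 310, 290]}
    print(fold_over_basal(agonist, basal))
    print(percent_inhibition([200, 210, 190], agonist[1e-6], basal))
```

Normalizing to unstimulated wells from the same plate is chosen here because it cancels plate-to-plate differences in cell number and substrate lot; other normalizations (for example, to a reference agonist) would be equally reasonable.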

The assays of this invention, and their application and preparation, have been 
described both generically and by specific example. The examples are not intended as 
limiting. Other substituent identities, characteristics and assays will occur to those of 
ordinary skill in the art, without the exercise of inventive faculty. Such modifications remain 
within the scope of the invention, unless excluded by the express recitation of the claims 
advanced below. 
WHAT IS CLAIMED IS: 

1. A method of assessing the effect of a test condition on G-protein-coupled receptor 
(GPCR) pathway activity, comprising: 

a) providing a cell that expresses a GPCR as a fusion protein to one mutant form of 
reporter enzyme and an interacting protein partner as a fusion to another mutant form of 
enzyme; 

b) exposing the cell to a ligand for said GPCR under said test condition; and 

c) monitoring activation of said GPCR by complementation of said reporter enzyme; 

wherein increased reporter enzyme activity in the cell compared to that which occurs 
in the absence of said test condition indicates increased GPCR interaction with its interacting 
protein partner compared to that which occurs in the absence of said test condition, and 
decreased reporter enzyme activity in the cell compared to that which occurs in the absence of 
said test condition indicates decreased GPCR interaction with its interacting protein partner 
compared to that which occurs in the absence of said test condition. 

2. A method according to Claim 1, wherein the test condition is the presence in the 
cell of a kinase. 

3. A method according to Claim 1, wherein the test condition is the presence in the 
cell of a G-protein. 

4. A method according to Claim 1, wherein the test condition is the exposure of the 
cell to a compound selected from GPCR agonists and GPCR antagonists. 

5. A method according to Claim 1, wherein the test condition is co-expression in the 
cell of a second receptor. 

6. A method according to Claim 5, wherein the second receptor is a GPCR receptor. 



7. A method according to Claim 5, wherein homo-dimerization of GPCR is 
determined. 

8. A method according to Claim 5, wherein hetero-dimerization of GPCR is 
determined. 

9. A method for screening a β-arrestin protein or an unidentified arrestin or arrestin-like 
protein or fragment and mutant form thereof for the ability to bind to activated GPCRs, 
comprising: 

a) providing a cell that: 

i) expresses at least one GPCR as a fusion protein to a reporter enzyme; and 

ii) contains a conjugate comprising a test β-arrestin protein as a fusion protein 
with another reporter enzyme; 

b) exposing the cell to a ligand for said at least one GPCR; and 

c) detecting enzymatic activity of the complemented reporter enzyme; 

wherein an increase in enzymatic activity in the cell indicates β-arrestin protein 
binding to the activated GPCR. 

10. A method for screening a test compound for G-protein-coupled receptor (GPCR) 
agonist activity, comprising: 

a) providing a cell that expresses a GPCR as a fusion protein to one mutant form of 
reporter enzyme and an arrestin protein as a fusion to another mutant form of enzyme; 

b) exposing the cell to a test compound; and 

c) detecting complementation of said reporter enzyme; 

wherein increased reporter enzyme activity after exposure of the cell to the test 
compound indicates GPCR agonist activity of the test compound. 

11. A method according to Claim 10, wherein the cell expresses a GPCR whose 
function is known. 

12. A method according to Claim 10, wherein the cell expresses a GPCR whose 
function is unknown. 

13. A method according to Claim 10, wherein the cell expresses an odorant or taste 
GPCR. 

14. A method according to Claim 10, wherein the cell expresses a β-adrenergic GPCR. 

15. A method according to Claim 10, wherein the cell is selected from the group 
consisting of mammalian cells, cells of invertebrate origin, plant cells and protozoa cells. 

16. A method according to Claim 10, wherein the cell endogenously expresses a 
GPCR. 

17. A method according to Claim 10, wherein the cell has been transformed to 
express a GPCR not endogenously expressed by such a cell. 

18. A method of screening a test compound for G-protein-coupled receptor (GPCR) 
antagonist activity, comprising: 

a) providing a cell that expresses a GPCR as a fusion protein to one mutant form of 
reporter enzyme and an arrestin protein as a fusion to another mutant form of enzyme; 

b) exposing the cell to said test compound; 

c) exposing the cell to an agonist for said GPCR; and 

d) detecting complementation of said reporter enzyme; 

where exposure to the agonist occurs at the same time as, or subsequent to, exposure 
to the test compound, and wherein decreased reporter enzyme activity after exposure of the 
cell to the test compound indicates that the test compound is an antagonist for said GPCR. 

19. A method of screening a cell for the presence of a G-protein-coupled receptor 
(GPCR) responsive to a GPCR agonist, comprising: 

a) providing a cell, said cell containing a conjugate comprising a β-arrestin protein as 
a fusion protein with a reporter enzyme; 

b) exposing the cell to a GPCR agonist; and 

c) detecting enzymatic activity of the reporter enzyme; 

wherein an increase in enzymatic activity after exposure of the cell to the GPCR 
agonist indicates that the cell contains a GPCR responsive to said agonist. 

20. A method of screening a plurality of cells for those cells which contain a G- 
protein-coupled receptor (GPCR) responsive to a GPCR agonist, comprising: 

a) providing a plurality of cells, said cells containing a conjugate comprising a 
β-arrestin protein as a fusion protein with a reporter enzyme; 

b) exposing the cells to a GPCR agonist; and 

c) detecting enzymatic activity of the reporter enzyme; 

wherein an increase in enzymatic activity after exposure to the GPCR agonist 
indicates β-arrestin protein binding to a GPCR, thereby indicating that the cell contains a 
GPCR responsive to said GPCR agonist. 

21. A method according to Claim 20, wherein the plurality of cells are contained in a 
tissue. 

22. A method according to Claim 20, wherein the plurality of cells are contained in 
an organ. 

23. A method according to Claim 20, wherein step (b) comprises exposing the cells to 
a plurality of GPCR agonists or ligand libraries. 

24. A substrate having deposited thereon a plurality of cells, said cells expressing at 
least one GPCR as a fusion protein to one mutant form of reporter enzyme and an arrestin 
protein as a fusion to another mutant form of enzyme. 

25. A substrate according to Claim 24, wherein the substrate contains an enzyme- 
labile chemical group which, upon cleavage by the reporter enzyme, releases a product 
measurable by colorimetry, fluorescence or chemiluminescence. 

26. A substrate according to Claim 24, wherein the substrate is made of a material 
selected from glass, plastic, ceramic, semiconductor, silica, fiber optic, diamond, 
biocompatible monomer and biocompatible polymer materials. 

27. A method of detecting G-protein-coupled receptor (GPCR) pathway activity in a 
cell expressing at least one GPCR and containing β-arrestin protein as a fusion protein with a 
reporter enzyme; wherein said enzymatic activity indicates activation of the GPCR pathway. 

28. A method according to Claim 27, where the cells are deposited on a substrate 
prior to detecting said enzymatic activity. 

29. A method according to Claim 27, wherein said cell is contained in a tissue. 

30. A method according to Claim 27, wherein said cell is contained in an organ. 
ABSTRACT 

Methods for detecting G-protein-coupled receptor (GPCR) activity; methods of 
assaying GPCR activity; and methods of screening for GPCR ligands, G-protein-coupled 
receptor kinase (GRK) activity, and compounds that interact with components of the GPCR 
regulatory process are described. 
Cellular Expression of βArr2-βgalΔω Fusion Protein in C2 Clones 
(measured by anti-β-gal ELISA) 

Agonist Stimulated cAMP Response in C2 Cells Expressing β2AR-βgalΔα 

β-galactosidase Complementation as a Measurement for β2AR-βgalΔα 
Interacting with βArrestin2-βgalΔω upon Agonist Stimulation 

β-galactosidase Complementation as a Measurement for β2AR-βgalΔα 
Interaction with βArrestin1-βgalΔω upon Agonist Stimulation 

FIGURE 3B 



β-galactosidase Activity in Response to Agonist in C2 Cells 
Coexpressing β2AR-βgalΔα and βArrestin2-βgalΔω Fusion Proteins 

FIGURE 4A 



β-galactosidase Activity in Response to Agonist in C2 Cells 
Coexpressing β2AR-βgalΔα and βArrestin1-βgalΔω Fusion Proteins 

Inhibition of β-galactosidase Activity in C2 Cells Coexpressing 
β2AR-βgalΔα and βArrestin2-βgalΔω Fusion Proteins 

FIGURE 5A 



Antagonist Inhibition of β-galactosidase Activity in C2 Cells 
Coexpressing β2AR-βgalΔα and βArrestin1-βgalΔω Fusion Proteins 

Agonist Stimulated cAMP Response in Clones or Pools of C2 Cells 
Coexpressing A2aR-βgalΔα and βArrestin1-βgalΔω Fusion Proteins 

FIGURE 6 



Agonist Stimulated cAMP Response in Clones or Pools of C2 Cells 
Expressing D1-βgalΔα and βArrestin2-βgalΔω Fusion Proteins 

FIGURE 7 



β2AR-βgalΔω and βarr2-βgalΔα Interaction in HEK293 
Clones in Response to Isoproterenol Treatment (1 μM) 

FIGURE 8A 



β2AR-βgalΔα and βArr1-βgalΔω Interaction in a CHO Pool 
in Response to Isoproterenol Treatment (10 μM) 

FIGURE 8B 



β2AR-βgalΔα and βArr2-βgalΔω Interaction in CHW Clone 
in Response to Isoproterenol Treatment (10 μM) 

[Plot: x-axis, Time (min): no agonist, 5, 10, 20, 30, 40, 60; y-axis ticks 500 to 3000; series: Clone 70-4, Parental CHW] 

FIGURE 8C 



β-galactosidase Complementation as a Measurement for 
Adrenergic Receptor Homodimerization in HEK 293 Cells 
Coexpressing β2AR-βgalΔα and β2AR-βgalΔω 

Agonist Stimulated cAMP Response in HEK 293 Cells 
Coexpressing β2AR-βgalΔα and β2AR-βgalΔω 

951 TCCCTTAAGT TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC 
AGGGAATTCA AACTGGAATC CATTGACCTT TCTACAGCTC GCCGAGCGAG 



1001 ACAACCAGTC GGTAGATGTC AAGAAGAGAC GTTGGGTTAC CTTCTGCTCT 
TGTTGGTCAG CCATCTACAG TTCTTCTCTG CAACCCAATG GAAGACGAGA 



1051 GCAGAATGGC CAACCTTTAA CGTCGGATGG CCGCGAGACG GCACCTTTAA 
CGTCTTACCG GTTGGAAATT GCAGCCTACC GGCGCTCTGC CGTGGAAATT 



1101 CCGAGACCTC ATCACCCAGG TTAAGATCAA GGTCTTTTCA CCTGGCCCGC 
GGCTCTGGAG TAGTGGGTCC AATTCTAGTT CCAGAAAAGT GGACCGGGCG 



1151 ATGGACACCC AGACCAGGTC CCCTACATCG TGACCTGGGA AGCCTTGGCT 
TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA 



1201 TTTGACCCCC CTCCCTGGGT CAAGCCCTTT GTACACCCTA AGCCTCCGCC 
AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG 



1251 TCCTCTTCCT CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA 
AGGAGAAGGA GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT 



1301 CCCCGCCTCG ATCCTCCCTT TATCCAGCCC TCACTCCTTC TCTAGGCGCC 
GGGGCGGAGC TAGGAGGGAA ATAGGTCGGG AGTGAGGAAG AGATCCGCGG 



1551 GGTTTCCGGC ACCAGAAGCG GTGCCGGAAA GCTGGCTGGA GTGCGATCTT 
CCAAAGGCCG TGGTCTTCGC CACGGCCTTT CGACCGACCT CACGCTAGAA 



+ 2PEAD TVV VPS NWQM HGY 



1601 CCTGAGGCCG ATACTGTCGT CGTCCCCTCA AACTGGCAGA TGCACGGTTA 
GGACTCCGGC TATGACAGCA GCAGGGGAGT TTGACCGTCT ACGTGCCAAT 



+2 DAP IYTN VTY PIT VNP 



1651 CGATGCGCCC ATCTACACCA ACGTGACCTA TCCCATTACG GTCAATCCGC 
GCTACGCGGG TAGATGTGGT TGCACTGGAT AGGGTAATGC CAGTTAGGCG 
+ 2 P F V P TEN PTGC YSL T F N 



1701 CGTTTGTTCC CACGGAGAAT CCGACGGGTT GTTACTCGCT CACATTTAAT 
GCAAACAAGG GTGCCTCTTA GGCTGCCCAA CAATGAGCGA GTGTAAATTA 



+2VDES WLQ EGQ TRII FDG 



1751 GTTGATGAAA GCTGGCTACA GGAAGGCCAG ACGCGAATTA TTTTTGATGG 
CAACTACTTT CGACCGATGT CCTTCCGGTC TGCGCTTAAT AAAAACTACC 



+ 2 VNS AFHL WCN GRW VGY 



1801 CGTTAACTCG GCGTTTCATC TGTGGTGCAA CGGGCGCTGG GTCGGTTACG 
GCAATTGAGC CGCAAAGTAG ACACCACGTT GCCCGCGACC CAGCCAATGC 



+ 2GQDS RLP SEFD LSA FLR 



1851 GCCAGGACAG TCGTTTGCCG TCTGAATTTG ACCTGAGCGC ATTTTTACGC 
CGGTCCTGTC AGCAAACGGC AGACTTAAAC TGGACTCGCG TAAAAATGCG 



+2 AGEN RLA VMV LRWS DGS 



1901 GCCGGAGAAA ACCGCCTCGC GGTGATGGTG CTGCGCTGGA GTGACGGCAG 
CGGCCTCTTT TGGCGGAGCG CCACTACCAC GACGCGACCT CACTGCCGTC 



+2 YLE DQDM WRM SGI FRD 



1951 TTATCTGGAA GATCAGGATA TGTGGCGGAT GAGCGGCATT TTCCGTGACG 
AATAGACCTT CTAGTCCTAT ACACCGCCTA CTCGCCGTAA AAGGCACTGC 



+2 V S L L HKP TTQI SDF H V A 



2001 TCTCGTTGCT GCATAAACCG ACTACACAAA TCAGCGATTT CCATGTTGCC 
AGAGCAACGA CGTATTTGGC TGATGTGTTT AGTCGCTAAA GGTACAACGG 



+2 T R F N DDF SRA VLEA EVQ 



2051 ACTCGCTTTA ATGATGATTT CAGCCGCGCT GTACTGGAGG CTGAAGTTCA 
TGAGCGAAAT TACTACTAAA GTCGGCGCGA CATGACCTCC GACTTCAAGT 



+2 MCG ELRD YLR VTV SLW 



2101 GATGTGCGGC GAGTTGCGTG ACTACCTACG GGTAACAGTT TCTTTATGGC 
CTACACGCCG CTCAACGCAC TGATGGATGC CCATTGTCAA AGAAATACCG 



+2QGET Q V A SGTA PFG GEI 



2151 AGGGTGAAAC GCAGGTCGCC AGCGGCACCG CGCCTTTCGG CGGTGAAATT 
TCCCACTTTG CGTCCAGCGG TCGCCGTGGC GCGGAAAGCC GCCACTTTAA 



+ 2IDER GGY ADR VTLR LNV 



2201 ATCGATGAGC GTGGTGGTTA TGCCGATCGC GTCACACTAC GTCTGAACGT 
TAGCTACTCG CACCACCAAT ACGGCTAGCG CAGTGTGATG CAGACTTGCA 



+2 ENP KLWS A E I PNL YRA 



2251 CGAAAACCCG AAACTGTGGA GCGCCGAAAT CCCGAATCTC TATCGTGCGG 
GCTTTTGGGC TTTGACACCT CGCGGCTTTA GGGCTTAGAG ATAGCACGCC 
+ 2VVEL HTA DGTL I E A E A C 



2301 TGGTTGAACT GCACACCGCC GACGGCACGC TGATTGAAGC AGAAGCCTGC 
ACCAACTTGA CGTGTGGCGG CTGCCGTGCG ACTAACTTCG TCTTCGGACG 



+ 2DVGF REV RIE NGLL LLN 



2351 GATGTCGGTT TCCGCGAGGT GCGGATTGAA AATGGTCTGC TGCTGCTGAA 
CTACAGCCAA AGGCGCTCCA CGCCTAACTT TTACCAGACG ACGACGACTT 



+2 GKP LLIR GVN RHE HHP 



2401 CGGCAAGCCG TTGCTGATTC GAGGCGTTAA CCGTCACGAG CATCATCCTC 
GCCGTTCGGC AACGACTAAG CTCCGCAATT GGCAGTGCTC GTAGTAGGAG 



+ 2LHGQ VMD EQTM VQD ILL 



2451 TGCATGGTCA GGTCATGGAT GAGCAGACGA TGGTGCAGGA TATCCTGCTG 
ACGTACCAGT CCAGTACCTA CTCGTCTGCT ACCACGTCCT ATAGGACGAC 



+2 M K Q N NFN A V R CSHY PNH 



2501 ATGAAGCAGA ACAACTTTAA CGCCGTGCGC TGTTCGCATT ATCCGAACCA 
TACTTCGTCT TGTTGAAATT GCGGCACGCG ACAAGCGTAA TAGGCTTGGT 



+2 PLW YTLC DRY GLY VVD 



2551 TCCGCTGTGG TACACGCTGT GCGACCGCTA CGGCCTGTAT GTGGTGGATG 
AGGCGACACC ATGTGCGACA CGCTGGCGAT GCCGGACATA CACCACCTAC 



+2 E A N I ETH GMVP MNR LTD 



2601 AAGCCAATAT TGAAACCCAC GGCATGGTGC CAATGAATCG TCTGACCGAT 
TTCGGTTATA ACTTTGGGTG CCGTACCACG GTTACTTAGC AGACTGGCTA 



+2 DPRW LPA MSE RVTR M V Q 



2651 GATCCGCGCT GGCTACCGGC GATGAGCGAA CGCGTAACGC GAATGGTGCA 
CTAGGCGCGA CCGATGGCCG CTACTCGCTT GCGCATTGCG CTTACCACGT 



+2 RDR NHPS VII WSL GNE 



2701 GCGCGATCGT AATCACCCGA GTGTGATCAT CTGGTCGCTG GGGAATGAAT 
CGCGCTAGCA TTAGTGGGCT CACACTAGTA GACCAGCGAC CCCTTACTTA 



+ 2SGHG ANH DALY R W I KSV 



2751 CAGGCCACGG CGCTAATCAC GACGCGCTGT ATCGCTGGAT CAAATCTGTC 
GTCCGGTGCC GCGATTAGTG CTGCGCGACA TAGCGACCTA GTTTAGACAG 



+ 2DPSR PVQ YEG GGAD TTA 



2801 GATCCTTCCC GCCCGGTGCA GTATGAAGGC GGCGGAGCCG ACACCACGGC 
CTAGGAAGGG CGGGCCACGT CATACTTCCG CCGCCTCGGC TGTGGTGCCG 



+ 2 TDI ICPM YAR VDE DQP 



2851 CACCGATATT ATTTGCCCGA TGTACGCGCG CGTGGATGAA GACCAGCCCT 
GTGGCTATAA TAAACGGGCT ACATGCGCGC GCACCTACTT CTGGTCGGGA 



+2 F P A V PKW SIKK WLS LPG 



2901 TCCCGGCTGT GCCGAAATGG TCCATCAAAA AATGGCTTTC GCTACCTGGA 
AGGGCCGACA CGGCTTTACC AGGTAGTTTT TTACCGAAAG CGATGGACCT 



+2ETRP L I L CEY AHAM GNS 



2951 GAGACGCGCC CGCTGATCCT TTGCGAATAC GCCCACGCGA TGGGTAACAG 
CTCTGCGCGG GCGACTAGGA AACGCTTATG CGGGTGCGCT ACCCATTGTC 



+ 2 LGG FAKY WQA FRQ YPR 



3001 TCTTGGCGGT TTCGCTAAAT ACTGGCAGGC GTTTCGTCAG TATCCCCGTT 
AGAACCGCCA AAGCGATTTA TGACCGTCCG CAAAGCAGTC ATAGGGGCAA 



+ 2LQGG FVW DWVD QSL IKY 



3051 TACAGGGCGG CTTCGTCTGG GACTGGGTGG ATCAGTCGCT GATTAAATAT 
ATGTCCCGCC GAAGCAGACC CTGACCCACC TAGTCAGCGA CTAATTTATA 



+2 DENG NPW SAY GGDF GDT 



3101 GATGAAAACG GCAACCCGTG GTCGGCTTAC GGCGGTGATT TTGGCGATAC 
CTACTTTTGC CGTTGGGCAC CAGCCGAATG CCGCCACTAA AACCGCTATG 



+2 PND RQFC MNG LVF ADR 



3151 GCCGAACGAT CGCCAGTTCT GTATGAACGG TCTGGTCTTT GCCGACCGCA 
CGGCTTGCTA GCGGTCAAGA CATACTTGCC AGACCAGAAA CGGCTGGCGT 



+2 TPHP ALT EAKH QQQ FFQ 



3201 CGCCGCATCC AGCGCTGACG GAAGCAAAAC ACCAGCAGCA GTTTTTCCAG 
GCGGCGTAGG TCGCGACTGC CTTCGTTTTG TGGTCGTCGT CAAAAAGGTC 



+2 F R L S GQT IEV TSEY LFR 



3251 TTCCGTTTAT CCGGGCAAAC CATCGAAGTG ACCAGCGAAT ACCTGTTCCG 
AAGGCAAATA GGCCCGTTTG GTAGCTTCAC TGGTCGCTTA TGGACAAGGC 



+ 2 HSD NELL HWM VAL DGK 



3301 TCATAGCGAT AACGAGCTCC TGCACTGGAT GGTGGCGCTG GATGGTAAGC 
AGTATCGCTA TTGCTCGAGG ACGTGACCTA CCACCGCGAC CTACCATTCG 



+2PLAS GEV PLDV APQ GKQ 



3351 CGCTGGCAAG CGGTGAAGTG CCTCTGGATG TCGCTCCACA AGGTAAACAG 
GCGACCGTTC GCCACTTCAC GGAGACCTAC AGCGAGGTGT TCCATTTGTC 



+ 2LIEL PEL PQP ESAG QLW 



3401 TTGATTGAAC TGCCTGAACT ACCGCAGCCG GAGAGCGCCG GGCAACTCTG 
AACTAACTTG ACGGACTTGA TGGCGTCGGC CTCTCGCGGC CCGTTGAGAC 



+ 2 LTV RVVQ PNA TAW SEA 



3451 GCTCACAGTA CGCGTAGTGC AACCGAACGC GACCGCATGG TCAGAAGCCG 
CGAGTGTCAT GCGCATCACG TTGGCTTGCG CTGGCGTACC AGTCTTCGGC 
+ 2GHIS A W Q QWRL AEN LSV 



3501 GGCACATCAG CGCCTGGCAG CAGTGGCGTC TGGCGGAAAA CCTCAGTGTG 
CCGTGTAGTC GCGGACCGTC GTCACCGCAG ACCGCCTTTT GGAGTCACAC 



+2TLPA ASH AIP HLTT SEM 



3551 ACGCTCCCCG CCGCGTCCCA CGCCATCCCG CATCTGACCA CCAGCGAAAT 
TGCGAGGGGC GGCGCAGGGT GCGGTAGGGC GTAGACTGGT GGTCGCTTTA 



+2 DFC IELG NKR WQF NRQ 



3601 GGATTTTTGC ATCGAGCTGG GTAATAAGCG TTGGCAATTT AACCGCCAGT 
CCTAAAAACG TAGCTCGACC CATTATTCGC AACCGTTAAA TTGGCGGTCA 



+ 2SGFL SQM WIGD KKQ LLT 



3651 CAGGCTTTCT TTCACAGATG TGGATTGGCG ATAAAAAACA ACTGCTGACG 
GTCCGAAAGA AAGTGTCTAC ACCTAACCGC TATTTTTTGT TGACGACTGC 



+2 PLRD QFT RAP LDND IGV 



3701 CCGCTGCGCG ATCAGTTCAC CCGTGCACCG CTGGATAACG ACATTGGCGT 
GGCGACGCGC TAGTCAAGTG GGCACGTGGC GACCTATTGC TGTAACCGCA 



+2 SEA T R I D P N A W V E R W K 



3751 AAGTGAAGCG ACCCGCATTG ACCCTAACGC CTGGGTCGAA CGCTGGAAGG 
TTCACTTCGC TGGGCGTAAC TGGGATTGCG GACCCAGCTT GCGACCTTCC 



+2 A A G H YQA EAAL LQC TAD 



3801 CGGCGGGCCA TTACCAGGCC GAAGCAGCGT TGTTGCAGTG CACGGCAGAT 
GCCGCCCGGT AATGGTCCGG CTTCGTCGCA ACAACGTCAC GTGCCGTCTA 



+2 TLAD A V L ITT AHAW QHQ 



3851 ACACTTGCTG ATGCGGTGCT GATTACGACC GCTCACGCGT GGCAGCATCA 
TGTGAACGAC TACGCCACGA CTAATGCTGG CGAGTGCGCA CCGTCGTAGT 



+ 2 GKT LFIS RKT YRI DGS 



3901 GGGGAAAACC TTATTTATCA GCCGGAAAAC CTACCGGATT GATGGTAGTG 
CCCCTTTTGG AATAAATAGT CGGCCTTTTG GATGGCCTAA CTACCATCAC 



+2GQMA ITV DVEV ASD TPH 



3951 GTCAAATGGC GATTACCGTT GATGTTGAAG TGGCGAGCGA TACACCGCAT 
CAGTTTACCG CTAATGGCAA CTACAACTTC ACCGCTCGCT ATGTGGCGTA 



+2 PARI GLN CQL A Q V A ERV 



4001 CCGGCGCGGA TTGGCCTGAA CTGCCAGCTG GCGCAGGTAG CAGAGCGGGT 
GGCCGCGCCT AACCGGACTT GACGGTCGAC CGCGTCCATC GTCTCGCCCA 



+ 2 NWL GLGP QEN YPD RLT 



4051 AAACTGGCTC GGATTAGGGC CGCAAGAAAA CTATCCCGAC CGCCTTACTG 
TTTGACCGAG CCTAATCCCG GCGTTCTTTT GATAGGGCTG GCGGAATGAC 
+ 2 A A C F DRW DLPL SDM YTP 



4101 CCGCCTGTTT TGACCGCTGG GATCTGCCAT TGTCAGACAT GTATACCCCG 
GGCGGACAAA ACTGGCGACC CTAGACGGTA ACAGTCTGTA CATATGGGGC 



+ 2YVFP SEN GLR CGTR ELN 



4151 TACGTCTTCC CGAGCGAAAA CGGTCTGCGC TGCGGGACGC GCGAATTGAA 
ATGCAGAAGG GCTCGCTTTT GCCAGACGCG ACGCCCTGCG CGCTTAACTT 



+ 2 YGP HQWR GDF QFN ISR 



4201 TTATGGCCCA CACCAGTGGC GCGGCGACTT CCAGTTCAAC ATCAGCCGCT 
AATACCGGGT GTGGTCACCG CGCCGCTGAA GGTCAAGTTG TAGTCGGCGA 



+ 2YSQQ QLM ETSH RHL LHA 



4251 ACAGTCAACA GCAACTGATG GAAACCAGCC ATCGCCATCT GCTGCACGCG 
TGTCAGTTGT CGTTGACTAC CTTTGGTCGG TAGCGGTAGA CGACGTGCGC 



+2 EEGT WLN IDG FHMG IGG 



4301 GAAGAAGGCA CATGGCTGAA TATCGACGGT TTCCATATGG GGATTGGTGG 
CTTCTTCCGT GTACCGACTT ATAGCTGCCA AAGGTATACC CCTAACCACC 



+2 DDS WSPS VSA EFQ LSA 



4351 CGACGACTCC TGGAGCCCGT CAGTATCGGC GGAATTCCAG CTGAGCGCCG 
GCTGCTGAGG ACCTCGGGCA GTCATAGCCG CCTTAAGGTC GACTCGCGGC 



+2 GRYH YQL VWCQ KRS DYK 



4401 GTCGCTACCA TTACCAGTTG GTCTGGTGTC AAAAAAGATC TGACTATAAA 
CAGCGATGGT AATGGTCAAC CAGACCACAG TTTTTTCTAG ACTGATATTT 



+2 DEDL DHH HHH HR 



4451 GATGAGGACC TCGACCATCA TCATCATCAT CACCGGTAAT AATAGGTAGA 
CTACTCCTGG AGCTGGTAGT AGTAGTAGTA GTGGCCATTA TTATCCATCT 



4501 TAAGTGACTG ATTAGATGCA TTGATCCCTC GACCAATTCC GGTTATTTTC 
ATTCACTGAC TAATCTACGT AACTAGGGAG CTGGTTAAGG CCAATAAAAG 



4551 CACCATATTG CCGTCTTTTG GCAATGTGAG GGCCCGGAAA CCTGGCCCTG 
GTGGTATAAC GGCAGAAAAC CGTTACACTC CCGGGCCTTT GGACCGGGAC 



4601 TCTTCTTGAC GAGCATTCCT AGGGGTCTTT CCCCTCTCGC CAAAGGAATG 
AGAAGAACTG CTCGTAAGGA TCCCCAGAAA GGGGAGAGCG GTTTCCTTAC 



4651 CAAGGTCTGT TGAATGTCGT GAAGGAAGCA GTTCCTCTGG AAGCTTCTTG 
GTTCCAGACA ACTTACAGCA CTTCCTTCGT CAAGGAGACC TTCGAAGAAC 



4701 AAGACAAACA ACGTCTGTAG CGACCCTTTG CAGGCAGCGG AACCCCCCAC 
TTCTGTTTGT TGCAGACATC GCTGGGAAAC GTCCGTCGCC TTGGGGGGTG 



4751 CTGGCGACAG GTGCCTCTGC GGCCAAAAGC CACGTGTATA AGATACACCT 
GACCGCTGTC CACGGAGACG CCGGTTTTCG GTGCACATAT TCTATGTGGA 
4801 GCAAAGGCGG CACAACCCCA GTGCCACGTT GTGAGTTGGA TAGTTGTGGA 
CGTTTCCGCC GTGTTGGGGT CACGGTGCAA CACTCAACCT ATCAACACCT 



4851 AAGAGTCAAA TGGCTCTCCT CAAGCGTATT CAACAAGGGG CTGAAGGATG 
TTCTCAGTTT ACCGAGAGGA GTTCGCATAA GTTGTTCCCC GACTTCCTAC 



4901 CCCAGAAGGT ACCCCATTGT ATGGGATCTG ATCTGGGGCC TCGGTGCACA 
GGGTCTTCCA TGGGGTAACA TACCCTAGAC TAGACCCCGG AGCCACGTGT 



4951 TGCTTTACAT GTGTTTAGTC GAGGTTAAAA AACGTCTAGG CCCCCCGAAC 
ACGAAATGTA CACAAATCAG CTCCAATTTT TTGCAGATCC GGGGGGCTTG 



5001 CACGGGGACG TGGTTTTCCT TTGAAAAACA CGATGATAAT ACCATGATTG 
GTGCCCCTGC ACCAAAAGGA AACTTTTTGT GCTACTATTA TGGTACTAAC 



5051 AACAAGATGG ATTGCACGCA GGTTCTCCGG CCGCTTGGGT GGAGAGGCTA 
TTGTTCTACC TAACGTGCGT CCAAGAGGCC GGCGAACCCA CCTCTCCGAT 



5101 TTCGGCTATG ACTGGGCACA ACAGACAATC GGCTGCTCTG ATGCCGCCGT 
AAGCCGATAC TGACCCGTGT TGTCTGTTAG CCGACGAGAC TACGGCGGCA 



5151 GTTCCGGCTG TCAGCGCAGG GGCGCCCGGT TCTTTTTGTC AAGACCGACC 
CAAGGCCGAC AGTCGCGTCC CCGCGGGCCA AGAAAAACAG TTCTGGCTGG 



5201 TGTCCGGTGC CCTGAATGAA CTGCAGGACG AGGCAGCGCG GCTATCGTGG 
ACAGGCCACG GGACTTACTT GACGTCCTGC TCCGTCGCGC CGATAGCACC 



5251 CTGGCCACGA CGGGCGTTCC TTGCGCAGCT GTGCTCGACG TTGTCACTGA 
GACCGGTGCT GCCCGCAAGG AACGCGTCGA CACGAGCTGC AACAGTGACT 



5301 AGCGGGAAGG GACTGGCTGC TATTGGGCGA AGTGCCGGGG CAGGATCTCC 
TCGCCCTTCC CTGACCGACG ATAACCCGCT TCACGGCCCC GTCCTAGAGG 



5351 TGTCATCTCA CCTTGCTCCT GCCGAGAAAG TATCCATCAT GGCTGATGCA 
ACAGTAGAGT GGAACGAGGA CGGCTCTTTC ATAGGTAGTA CCGACTACGT 



5401 ATGCGGCGGC TGCATACGCT TGATCCGGCT ACCTGCCCAT TCGACCACCA 
TACGCCGCCG ACGTATGCGA ACTAGGCCGA TGGACGGGTA AGCTGGTGGT 



5451 AGCGAAACAT CGCATCGAGC GAGCACGTAC TCGGATGGAA GCCGGTCTTG 
TCGCTTTGTA GCGTAGCTCG CTCGTGCATG AGCCTACCTT CGGCCAGAAC 



5501 TCGATCAGGA TGATCTGGAC GAAGAGCATC AGGGGCTCGC GCCAGCCGAA 
AGCTAGTCCT ACTAGACCTG CTTCTCGTAG TCCCCGAGCG CGGTCGGCTT 



5551 CTGTTCGCCA GGCTCAAGGC GCGCATGCCC GACGGCGAGG ATCTCGTCGT 
GACAAGCGGT CCGAGTTCCG CGCGTACGGG CTGCCGCTCC TAGAGCAGCA 



5601 GACCCATGGC GATGCCTGCT TGCCGAATAT CATGGTGGAA AATGGCCGCT 
CTGGGTACCG CTACGGACGA ACGGCTTATA GTACCACCTT TTACCGGCGA 



5651 TTTCTGGATT CATCGACTGT GGCCGGCTGG GTGTGGCGGA CCGCTATCAG 
AAAGACCTAA GTAGCTGACA CCGGCCGACC CACACCGCCT GGCGATAGTC 



5701 GACATAGCGT TGGCTACCCG TGATATTGCT GAAGAGCTTG GCGGCGAATG 
CTGTATCGCA ACCGATGGGC ACTATAACGA CTTCTCGAAC CGCCGCTTAC 



5751 GGCTGACCGC TTCCTCGTGC TTTACGGTAT CGCCGCTCCC GATTCGCAGC 
CCGACTGGCG AAGGAGCACG AAATGCCATA GCGGCGAGGG CTAAGCGTCG 



5801 GCATCGCCTT CTATCGCCTT CTTGACGAGT TCTTCTGAGC GGGACTCTGG 
CGTAGCGGAA GATAGCGGAA GAACTGCTCA AGAAGACTCG CCCTGAGACC 



5851 GGTTCGCATC GATAAAATAA AAGATTTTAT TTAGTCTCCA GAAAAAGGGG 
CCAAGCGTAG CTATTTTATT TTCTAAAATA AATCAGAGGT CTTTTTCCCC 



5901 GGAATGAAAG ACCCCACCTG TAGGTTTGGC AAGCTAGCTT AAGTAACGCC 
CCTTACTTTC TGGGGTGGAC ATCCAAACCG TTCGATCGAA TTCATTGCGG 



5951 ATTTTGCAAG GCATGGAAAA ATACATAACT GAGAATAGAG AAGTTCAGAT 
TAAAACGTTC CGTACCTTTT TATGTATTGA CTCTTATCTC TTCAAGTCTA 



6001 CAAGGTCAGG AACAGATGGA ACAGCTGAAT ATGGGCCAAA CAGGATATCT 
GTTCCAGTCC TTGTCTACCT TGTCGACTTA TACCCGGTTT GTCCTATAGA 



6051 GTGGTAAGCA GTTCCTGCCC CGGCTCAGGG CCAAGAACAG ATGGAACAGC 
CACCATTCGT CAAGGACGGG GCCGAGTCCC GGTTCTTGTC TACCTTGTCG 



6101 TGAATATGGG CCAAACAGGA TATCTGTGGT AAGCAGTTCC TGCCCCGGCT 
ACTTATACCC GGTTTGTCCT ATAGACACCA TTCGTCAAGG ACGGGGCCGA 



6151 CAGGGCCAAG AACAGATGGT CCCCAGATGC GGTCCAGCCC TCAGCAGTTT 
GTCCCGGTTC TTGTCTACCA GGGGTCTACG CCAGGTCGGG AGTCGTCAAA 



6201 CTAGAGAACC ATCAGATGTT TCCAGGGTGC CCCAAGGACC TGAAATGACC 
GATCTCTTGG TAGTCTACAA AGGTCCCACG GGGTTCCTGG ACTTTACTGG 



6251 CTGTGCCTTA TTTGAACTAA CCAATCAGTT CGCTTCTCGC TTCTGTTCGC 
GACACGGAAT AAACTTGATT GGTTAGTCAA GCGAAGAGCG AAGACAAGCG 



6301 GCGCTTCTGC TCCCCGAGCT CAATAAAAGA GCCCACAACC CCTCACTCGG 
CGCGAAGACG AGGGGCTCGA GTTATTTTCT CGGGTGTTGG GGAGTGAGCC 



6351 GGCGCCAGTC CTCCGATTGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT 
CCGCGGTCAG GAGGCTAACT GACTCAGCGG GCCCATGGGC ACATAGGTTA 



6401 AAACCCTCTT GCAGTTGCAT CCGACTTGTG GTCTCGCTGT TCCTTGGGAG 
TTTGGGAGAA CGTCAACGTA GGCTGAACAC CAGAGCGACA AGGAACCCTC 



6451 GGTCTCCTCT GAGTGATTGA CTACCCGTCA GCGGGGGTCT TTCATTCATG 
CCAGAGGAGA CTCACTAACT GATGGGCAGT CGCCCCCAGA AAGTAAGTAC 



6501 CAGCATGTAT CAAAATTAAT TTGGTTTTTT TTCTTAAGTA TTTACATTAA 
GTCGTACATA GTTTTAATTA AACCAAAAAA AAGAATTCAT AAATGTAATT 



6551 ATGGCCATAG TTGCATTAAT GAATCGGCCA ACGCGCGGGG AGAGGCGGTT 
TACCGGTATC AACGTAATTA CTTAGCCGGT TGCGCGCCCC TCTCCGCCAA 



6601 TGCGTATTGG CGCTCTTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG 
ACGCATAACC GCGAGAAGGC GAAGGAGCGA GTGACTGAGC GACGCGAGCC 



6651 TCGTTCGGCT GCGGCGAGCG GTATCAGCTC ACTCAAAGGC GGTAATACGG 
AGCAAGCCGA CGCCGCTCGC CATAGTCGAG TGAGTTTCCG CCATTATGCC 
1 CTGCAGCCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG 
GACGTCGGAC TTATACCCGG TTTGTCCTAT AGACACCATT CGTCAAGGAC 



51 CCCCGGCTCA GGGCCAAGAA CAGATGGAAC AGCTGAATAT GGGCCAAACA 
GGGGCCGAGT CCCGGTTCTT GTCTACCTTG TCGACTTATA CCCGGTTTGT 



101 GGATATCTGT GGTAAGCAGT TCCTGCCCCG GCTCAGGGCC AAGAACAGAT 
CCTATAGACA CCATTCGTCA AGGACGGGGC CGAGTCCCGG TTCTTGTCTA 



151 GGTCCCCAGA TGCGGTCCAG CCCTCAGCAG TTTCTAGAGA ACCATCAGAT 
CCAGGGGTCT ACGCCAGGTC GGGAGTCGTC AAAGATCTCT TGGTAGTCTA 



201 GTTTCCAGGG TGCCCCAAGG ACCTGAAATG ACCCTGTGCC TTATTTGAAC 
CAAAGGTCCC ACGGGGTTCC TGGACTTTAC TGGGACACGG AATAAACTTG 



251 TAACCAATCA GTTCGCTTCT CGCTTCTGTT CGCGCGCTTC TGCTCCCCGA 
ATTGGTTAGT CAAGCGAAGA GCGAAGACAA GCGCGCGAAG ACGAGGGGCT 



301 GCTCAATAAA AGAGCCCACA ACCCCTCACT CGGGGCGCCA GTCCTCCGAT 
CGAGTTATTT TCTCGGGTGT TGGGGAGTGA GCCCCGCGGT CAGGAGGCTA 



351 TGACTGAGTC GCCCGGGTAC CCGTGTATCC AATAAACCCT CTTGCAGTTG 
ACTGACTCAG CGGGCCCATG GGCACATAGG TTATTTGGGA GAACGTCAAC 



401 CATCCGACTT GTGGTCTCGC TGTTCCTTGG GAGGGTCTCC TCTGAGTGAT 
GTAGGCTGAA CACCAGAGCG ACAAGGAACC CTCCCAGAGG AGACTCACTA 



451 TGACTACCCG TCAGCGGGGG TCTTTCATTT GGGGGCTCGT CCGGGATCGG 
ACTGATGGGC AGTCGCCCCC AGAAAGTAAA CCCCCGAGCA GGCCCTAGCC 



501 GAGACCCCTG CCCAGGGACC ACCGACCCAC CACCGGGAGG CAAGCTGGCC 
CTCTGGGGAC GGGTCCCTGG TGGCTGGGTG GTGGCCCTCC GTTCGACCGG 



551 AGCAACTTAT CTGTGTCTGT CCGATTGTCT AGTGTCTATG ACTGATTTTA 
TCGTTGAATA GACACAGACA GGCTAACAGA TCACAGATAC TGACTAAAAT 



601 TGCGCCTGCG TCGGTACTAG TTAGCTAACT AGCTCTGTAT CTGGCGGACC 
ACGCGGACGC AGCCATGATC AATCGATTGA TCGAGACATA GACCGCCTGG 



651 CGTGGTGGAA CTGACGAGTT CTGAACACCC GGCCGCAACC CTGGGAGACG 
GCACCACCTT GACTGCTCAA GACTTGTGGG CCGGCGTTGG GACCCTCTGC 



701 TCCCAGGGAC TTTGGGGGCC GTTTTTGTGG CCCGACCTGA GGAAGGGAGT 
AGGGTCCCTG AAACCCCCGG CAAAAACACC GGGCTGGACT CCTTCCCTCA 



751 CGATGTGGAA TCCGACCCCG TCAGGATATG TGGTTCTGGT AGGAGACGAG 
GCTACACCTT AGGCTGGGGC AGTCCTATAC ACCAAGACCA TCCTCTGCTC 



801 AACCTAAAAC AGTTCCCGCC TCCGTCTGAA TTTTTGCTTT CGGTTTGGAA 
TTGGATTTTG TCAAGGGCGG AGGCAGACTT AAAAACGAAA GCCAAACCTT 



851 CCGAAGCCGC GCGTCTTGTC TGCTGCAGCA TCGTTCTGTG TTGTCTCTGT 
GGCTTCGGCG CGCAGAACAG ACGACGTCGT AGCAAGACAC AACAGAGACA 



901 CTGACTGTGT TTCTGTATTT GTCTGAAAAT TAGGGCCAGA CTGTTACCAC 
GACTGACACA AAGACATAAA CAGACTTTTA ATCCCGGTCT GACAATGGTG 



951 TCCCTTAAGT TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC 
AGGGAATTCA AACTGGAATC CATTGACCTT TCTACAGCTC GCCGAGCGAG 



1001 ACAACCAGTC GGTAGATGTC AAGAAGAGAC GTTGGGTTAC CTTCTGCTCT 
TGTTGGTCAG CCATCTACAG TTCTTCTCTG CAACCCAATG GAAGACGAGA 



1051 GCAGAATGGC CAACCTTTAA CGTCGGATGG CCGCGAGACG GCACCTTTAA 
CGTCTTACCG GTTGGAAATT GCAGCCTACC GGCGCTCTGC CGTGGAAATT 



1101 CCGAGACCTC ATCACCCAGG TTAAGATCAA GGTCTTTTCA CCTGGCCCGC 
GGCTCTGGAG TAGTGGGTCC AATTCTAGTT CCAGAAAAGT GGACCGGGCG 



1151 ATGGACACCC AGACCAGGTC CCCTACATCG TGACCTGGGA AGCCTTGGCT 
TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA 



1201 TTTGACCCCC CTCCCTGGGT CAAGCCCTTT GTACACCCTA AGCCTCCGCC 
AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG 



1251 TCCTCTTCCT CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA 
O AGGAGAAGGA GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT 



iS01 CCCCGCCTCG ATCCTCCCTT TATCCAGCCC TCACTCCTTC TCTAGGCGCC 
In GGGGCGGAGC TAGGAGGGAA ATAGGTCGGG AGTGAGGAAG AGATCCGCGG 



1351 GGCCGCTCTA GCCCATTAAT ACGACTCACT ATAGGGCGAT TCGAACACCA 
.11 CCGGCGAGAT CGGGTAATTA TGCTGAGTGA TATCCCGCTA AGCTTGTGGT 



14 01 T G C AC CAT C A TCATCATCAC GTCGACTATA AAGATGAGGA CCTCGAGATG 
L ACGTGGTAGT AGTAGTAGTG CAGCTGATAT TTCTACTCCT GGAGCTCTAC 



fisi GGCGTGATTA CGGATTCACT GGCCGTCGTG GCCCGCACCG ATCGCCCTTC 
O CCGCACTAAT GCCTAAGTGA CCGGCAGCAC CGGGCGTGGC TAGCGGGAAG 



ttloi CCAACAGTTA CGCAGCCTGA ATGGCGAATG GCGCTTTGCC TGGTTTCCGG 
f i GGTTGTCAAT GCGTCGGACT TACCGCTTAC CGCGAAACGG ACCAAAGGCC 



1551 CACCAGAAGC GGTGCCGGAA AGCTGGCTGG AGTGCGATCT TCCTGAGGCC 
GTGGTCTTCG CCACGGCCTT TCGACCGACC TCACGCTAGA AGGACTCCGG



1601 GATACTGTCG TCGTCCCCTC AAACTGGCAG ATGCACGGTT ACGATGCGCC 
CTATGACAGC AGCAGGGGAG TTTGACCGTC TACGTGCCAA TGCTACGCGG 



1651 CATCTACACC AACGTGACCT ATCCCATTAC GGTCAATCCG CCGTTTGTTC 
GTAGATGTGG TTGCACTGGA TAGGGTAATG CCAGTTAGGC GGCAAACAAG 



1701 CCACGGAGAA TCCGACGGGT TGTTACTCGC TCACATTTAA TGTTGATGAA 
GGTGCCTCTT AGGCTGCCCA ACAATGAGCG AGTGTAAATT ACAACTACTT



1751 AGCTGGCTAC AGGAAGGCCA GACGCGAATT ATTTTTGATG GCGTTAACTC 
TCGACCGATG TCCTTCCGGT CTGCGCTTAA TAAAAACTAC CGCAATTGAG 



1801 GGCGTTTCAT CTGTGGTGCA ACGGGCGCTG GGTCGGTTAC GGCCAGGACA 
CCGCAAAGTA GACACCACGT TGCCCGCGAC CCAGCCAATG CCGGTCCTGT 



1851 GTCGTTTGCC GTCTGAATTT GACCTGAGCG CATTTTTACG CGCCGGAGAA 
CAGCAAACGG CAGACTTAAA CTGGACTCGC GTAAAAATGC GCGGCCTCTT 
1901 AACCGCCTCG CGGTGATGGT GCTGCGCTGG AGTGACGGCA GTTATCTGGA
TTGGCGGAGC GCCACTACCA CGACGCGACC TCACTGCCGT CAATAGACCT 



1951 AGATCAGGAT ATGTGGCGGA TGAGCGGCAT TTTCCGTGAC GTCTCGTTGC 
TCTAGTCCTA TACACCGCCT ACTCGCCGTA AAAGGCACTG CAGAGCAACG 



2001 TGCATAAACC GACTACACAA ATCAGCGATT TCCATGTTGC CACTCGCTTT 
ACGTATTTGG CTGATGTGTT TAGTCGCTAA AGGTACAACG GTGAGCGAAA 



2051 AATGATGATT TCAGCCGCGC TGTACTGGAG GCTGAAGTTC AGATGTGCGG 
TTACTACTAA AGTCGGCGCG ACATGACCTC CGACTTCAAG TCTACACGCC 



2101 CGAGTTGCGT GACTACCTAC GGGTAACAGT TTCTTTATGG CAGGGTGAAA 
GCTCAACGCA CTGATGGATG CCCATTGTCA AAGAAATACC GTCCCACTTT 



2151 CGCAGGTCGC CAGCGGCACC GCGCCTTTCG GCGGTGAAAT TATCGATGAG 
GCGTCCAGCG GTCGCCGTGG CGCGGAAAGC CGCCACTTTA ATAGCTACTC 



2201 CGTGGTGGTT ATGCCGATCG CGTCACACTA CGTCTGAACG TCGAAAACCC
GCACCACCAA TACGGCTAGC GCAGTGTGAT GCAGACTTGC AGCTTTTGGG



2251 GAAACTGTGG AGCGCCGAAA TCCCGAATCT CTATCGTGCG GTGGTTGAAC
CTTTGACACC TCGCGGCTTT AGGGCTTAGA GATAGCACGC CACCAACTTG



2301 TGCACACCGC CGACGGCACG CTGATTGAAG CAGAAGCCTG CGATGTCGGT
ACGTGTGGCG GCTGCCGTGC GACTAACTTC GTCTTCGGAC GCTACAGCCA 



2351 TTCCGCGAGG TGCGGATTGA AAATGGTCTG CTGCTGCTGA ACGGCAAGCC
AAGGCGCTCC ACGCCTAACT TTTACCAGAC GACGACGACT TGCCGTTCGG



2401 GTTGCTGATT CGAGGCGTTA ACCGTCACGA GCATCATCCT CTGCATGGTC
CAACGACTAA GCTCCGCAAT TGGCAGTGCT CGTAGTAGGA GACGTACCAG



2451 AGGTCATGGA TGAGCAGACG ATGGTGCAGG ATATCCTGCT GATGAAGCAG
TCCAGTACCT ACTCGTCTGC TACCACGTCC TATAGGACGA CTACTTCGTC 



2501 AACAACTTTA ACGCCGTGCG CTGTTCGCAT TATCCGAACC ATCCGCTGTG 
TTGTTGAAAT TGCGGCACGC GACAAGCGTA ATAGGCTTGG TAGGCGACAC 



2551 GTACACGCTG TGCGACCGCT ACGGCCTGTA TGTGGTGGAT GAAGCCAATA 
CATGTGCGAC ACGCTGGCGA TGCCGGACAT ACACCACCTA CTTCGGTTAT



2601 TTGAAACCCA CGGCATGGTG CCAATGAATC GTCTGACCGA TGATCCGCGC 
AACTTTGGGT GCCGTACCAC GGTTACTTAG CAGACTGGCT ACTAGGCGCG 



2651 TGGCTACCGG CGATGAGCGA ACGCGTAACG CGAATGGTGC AGCGCGATCG 
ACCGATGGCC GCTACTCGCT TGCGCATTGC GCTTACCACG TCGCGCTAGC 



2701 TAATCACCCG AGTGTGATCA TCTGGTCGCT GGGGAATGAA TCAGGCCACG
ATTAGTGGGC TCACACTAGT AGACCAGCGA CCCCTTACTT AGTCCGGTGC 



2751 GCGCTAATCA CGACGCGCTG TATCGCTGGA TCAAATCTGT CGATCCTTCC
CGCGATTAGT GCTGCGCGAC ATAGCGACCT AGTTTAGACA GCTAGGAAGG 



2801 CGCCCGGTGC AGTATGAAGG CGGCGGAGCC GACACCACGG CCACCGATAT 
GCGGGCCACG TCATACTTCC GCCGCCTCGG CTGTGGTGCC GGTGGCTATA 
2851 TATTTGCCCG ATGTACGCGC GCGTGGATGA AGACCAGCCC TTCCCGGCTG
ATAAACGGGC TACATGCGCG CGCACCTACT TCTGGTCGGG AAGGGCCGAC 



2901 TGCCGAAATG GTCCATCAAA AAATGGCTTT CGCTACCTGG AGAGACGCGC
ACGGCTTTAC CAGGTAGTTT TTTACCGAAA GCGATGGACC TCTCTGCGCG



2951 CCGCTGATCC TTTGCGAATA CGCCCACGCG ATGGGTAACA GTCTTGGCGG
GGCGACTAGG AAACGCTTAT GCGGGTGCGC TACCCATTGT CAGAACCGCC



3001 TTTCGCTAAA TACTGGCAGG CGTTTCGTCA GTATCCCCGT TTACAGGGCG
AAAGCGATTT ATGACCGTCC GCAAAGCAGT CATAGGGGCA AATGTCCCGC



3051 GCTTCGTCTG GGACTGGGTG GATCAGTCGC TGATTAAATA TGATGAAAAC
CGAAGCAGAC CCTGACCCAC CTAGTCAGCG ACTAATTTAT ACTACTTTTG



3101 GGCAACCCGT GGTCGGCTTA CGGCGGTGAT TTTGGCGATA CGCCGAACGA
CCGTTGGGCA CCAGCCGAAT GCCGCCACTA AAACCGCTAT GCGGCTTGCT



3151 TCGCCAGTTC TGTATGAACG GTCTGGTCTT TGCCGACCGC ACGCCGCATC
AGCGGTCAAG ACATACTTGC CAGACCAGAA ACGGCTGGCG TGCGGCGTAG



3201 CAGCGCTGAC GGAAGCAAAA CACCAGCAGC AGTTTTTCCA GTTCCGTTTA
GTCGCGACTG CCTTCGTTTT GTGGTCGTCG TCAAAAAGGT CAAGGCAAAT



3251 TCCGGGCAAA CCATCGAAGT GACCAGCGAA TACCTGTTCC GTCATAGCGA
AGGCCCGTTT GGTAGCTTCA CTGGTCGCTT ATGGACAAGG CAGTATCGCT



3301 TAACGAGCTC CTGCACTGGA TGGTGGCGCT GGATGGTAAG CCGCTGGCAA
ATTGCTCGAG GACGTGACCT ACCACCGCGA CCTACCATTC GGCGACCGTT



3351 GCGGTGAAGT GCCTCTGGAT GTCGCTCCAC AAGGTAAACA GTTGATTGAA
CGCCACTTCA CGGAGACCTA CAGCGAGGTG TTCCATTTGT CAACTAACTT



3401 CTGCCTGAAC TACCGCAGCC GGAGAGCGCC GGGCAACTCT GGCTCACAGT
GACGGACTTG ATGGCGTCGG CCTCTCGCGG CCCGTTGAGA CCGAGTGTCA



3451 ACGCGTAGTG CAACCGAACG CGACCGCATG GTCAGAAGCC GGGCACATCA 
TGCGCATCAC GTTGGCTTGC GCTGGCGTAC CAGTCTTCGG CCCGTGTAGT 



3501 GCGCCTGGCA GCAGTGGCGT CTGGCGGAAA ACCTCAGTGT GACGCTCCCC 
CGCGGACCGT CGTCACCGCA GACCGCCTTT TGGAGTCACA CTGCGAGGGG 



3551 GCCGCGTCCC ACGCCATCCC GCATCTGACC ACCAGCGAAA TGGATTTTTG 
CGGCGCAGGG TGCGGTAGGG CGTAGACTGG TGGTCGCTTT ACCTAAAAAC 



3601 CATCGAGCTG GGTAATAAGC GTTGGCAATT TAACCGCCAG TCAGGCTTTC 
GTAGCTCGAC CCATTATTCG CAACCGTTAA ATTGGCGGTC AGTCCGAAAG 



3651 TTTCACAGAT GTGGATTGGC GATAAAAAAC AACTGCTGAC GCCGCTGCGC 
AAAGTGTCTA CACCTAACCG CTATTTTTTG TTGACGACTG CGGCGACGCG 



3701 GATCAGTTCA CCCGTGCACC GCTGGATAAC GACATTGGCG TAAGTGAAGC
CTAGTCAAGT GGGCACGTGG CGACCTATTG CTGTAACCGC ATTCACTTCG 



3751 GACCCGCATT GACCCTAACG CCTGGGTCGA ACGCTGGAAG GCGGCGGGCC 
CTGGGCGTAA CTGGGATTGC GGACCCAGCT TGCGACCTTC CGCCGCCCGG 
3801 ATTACCAGGC CGAAGCAGCG TTGTTGCAGT GCACGGCAGA TACACTTGCT
TAATGGTCCG GCTTCGTCGC AACAACGTCA CGTGCCGTCT ATGTGAACGA 



3851 GATGCGGTGC TGATTACGAC CGCTCACGCG TGGCAGCATC AGGGGAAAAC 
CTACGCCACG ACTAATGCTG GCGAGTGCGC ACCGTCGTAG TCCCCTTTTG 



3901 CTTATTTATC AGCCGGAAAA CCTACCGGAT TGATGGTAGT GGTCAAATGG 
GAATAAATAG TCGGCCTTTT GGATGGCCTA ACTACCATCA CCAGTTTACC 



3951 CGATTACCGT TGATGTTGAA GTGGCGAGCG ATACACCGCA TCCGGCGCGG 
GCTAATGGCA ACTACAACTT CACCGCTCGC TATGTGGCGT AGGCCGCGCC 



4001 ATTGGCCTGA ACTGCCAGCT GGCGCAGGTA GCAGAGCGGG TAAACTGGCT 
TAACCGGACT TGACGGTCGA CCGCGTCCAT CGTCTCGCCC ATTTGACCGA 



4051 CGGATTAGGG CCGCAAGAAA ACTATCCCGA CCGCCTTACT GCCGCCTGTT
GCCTAATCCC GGCGTTCTTT TGATAGGGCT GGCGGAATGA CGGCGGACAA 



4101 TTGACCGCTG GGATCTGCCA TTGTCAGACA TGTATACCCC GTACGTCTTC
AACTGGCGAC CCTAGACGGT AACAGTCTGT ACATATGGGG CATGCAGAAG



4151 CCGAGCGAAA ACGGTCTGCG CTGCGGGACG CGCGAATTGA ATTATGGCCC
GGCTCGCTTT TGCCAGACGC GACGCCCTGC GCGCTTAACT TAATACCGGG 



4201 ACACCAGTGG CGCGGCGACT TCCAGTTCAA CATCAGCCGC TACAGTCAAC
TGTGGTCACC GCGCCGCTGA AGGTCAAGTT GTAGTCGGCG ATGTCAGTTG 



4251 AGCAACTGAT GGAAACCAGC CATCGCCATC TGCTGCACGC GGAAGAAGGC
TCGTTGACTA CCTTTGGTCG GTAGCGGTAG ACGACGTGCG CCTTCTTCCG 



4301 ACATGGCTGA ATATCGACGG TTTCCATATG GGGATTGGTG GCGACGACTC
TGTACCGACT TATAGCTGCC AAAGGTATAC CCCTAACCAC CGCTGCTGAG



4351 CTGGAGCCCG TCAGTATCGG CGGAATTCCA GCTGAGCGCC GGTCGCTACC
GACCTCGGGC AGTCATAGCC GCCTTAAGGT CGACTCGCGG CCAGCGATGG 



4401 ATTACCAGTT GGTCTGGTGT CAAAAAAGAT CTGGAGGTGG TGGCAGCAGG 
TAATGGTCAA CCAGACCACA GTTTTTTCTA GACCTCCACC ACCGTCGTCC 



4451 CCTTGGCGCG CCGGATCCTT AATTAACAAT TGACCGGTAA TAATAGGTAG
GGAACCGCGC GGCCTAGGAA TTAATTGTTA ACTGGCCATT ATTATCCATC 



4501 ATAAGTGACT GATTAGATGC ATTGATCCCT CGACCAATTC CGGTTATTTT
TATTCACTGA CTAATCTACG TAACTAGGGA GCTGGTTAAG GCCAATAAAA 



4551 CCACCATATT GCCGTCTTTT GGCAATGTGA GGGCCCGGAA ACCTGGCCCT 
GGTGGTATAA CGGCAGAAAA CCGTTACACT CCCGGGCCTT TGGACCGGGA 



4601 GTCTTCTTGA CGAGCATTCC TAGGGGTCTT TCCCCTCTCG CCAAAGGAAT
CAGAAGAACT GCTCGTAAGG ATCCCCAGAA AGGGGAGAGC GGTTTCCTTA 



4651 GCAAGGTCTG TTGAATGTCG TGAAGGAAGC AGTTCCTCTG GAAGCTTCTT 
CGTTCCAGAC AACTTACAGC ACTTCCTTCG TCAAGGAGAC CTTCGAAGAA 



4701 GAAGACAAAC AACGTCTGTA GCGACCCTTT GCAGGCAGCG GAACCCCCCA 
CTTCTGTTTG TTGCAGACAT CGCTGGGAAA CGTCCGTCGC CTTGGGGGGT 
4751 CCTGGCGACA GGTGCCTCTG CGGCCAAAAG CCACGTGTAT AAGATACACC
GGACCGCTGT CCACGGAGAC GCCGGTTTTC GGTGCACATA TTCTATGTGG 



4801 TGCAAAGGCG GCACAACCCC AGTGCCACGT TGTGAGTTGG ATAGTTGTGG
ACGTTTCCGC CGTGTTGGGG TCACGGTGCA ACACTCAACC TATCAACACC 



4851 AAAGAGTCAA ATGGCTCTCC TCAAGCGTAT TCAACAAGGG GCTGAAGGAT 
TTTCTCAGTT TACCGAGAGG AGTTCGCATA AGTTGTTCCC CGACTTCCTA 



4901 GCCCAGAAGG TACCCCATTG TATGGGATCT GATCTGGGGC CTCGGTGCAC
CGGGTCTTCC ATGGGGTAAC ATACCCTAGA CTAGACCCCG GAGCCACGTG 



4951 ATGCTTTACA TGTGTTTAGT CGAGGTTAAA AAACGTCTAG GCCCCCCGAA 
TACGAAATGT ACACAAATCA GCTCCAATTT TTTGCAGATC CGGGGGGCTT 



5001 CCACGGGGAC GTGGTTTTCC TTTGAAAAAC ACGATGATAA TACCATGATT 
GGTGCCCCTG CACCAAAAGG AAACTTTTTG TGCTACTATT ATGGTACTAA 



5051 GAACAAGATG GATTGCACGC AGGTTCTCCG GCCGCTTGGG TGGAGAGGCT
CTTGTTCTAC CTAACGTGCG TCCAAGAGGC CGGCGAACCC ACCTCTCCGA



5101 ATTCGGCTAT GACTGGGCAC AACAGACAAT CGGCTGCTCT GATGCCGCCG
TAAGCCGATA CTGACCCGTG TTGTCTGTTA GCCGACGAGA CTACGGCGGC



5151 TGTTCCGGCT GTCAGCGCAG GGGCGCCCGG TTCTTTTTGT CAAGACCGAC
ACAAGGCCGA CAGTCGCGTC CCCGCGGGCC AAGAAAAACA GTTCTGGCTG



5201 CTGTCCGGTG CCCTGAATGA ACTGCAGGAC GAGGCAGCGC GGCTATCGTG 
GACAGGCCAC GGGACTTACT TGACGTCCTG CTCCGTCGCG CCGATAGCAC



5251 GCTGGCCACG ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC GTTGTCACTG 
CGACCGGTGC TGCCCGCAAG GAACGCGTCG ACACGAGCTG CAACAGTGAC



5301 AAGCGGGAAG GGACTGGCTG CTATTGGGCG AAGTGCCGGG GCAGGATCTC
TTCGCCCTTC CCTGACCGAC GATAACCCGC TTCACGGCCC CGTCCTAGAG



5351 CTGTCATCTC ACCTTGCTCC TGCCGAGAAA GTATCCATCA TGGCTGATGC 
GACAGTAGAG TGGAACGAGG ACGGCTCTTT CATAGGTAGT ACCGACTACG 



5401 AATGCGGCGG CTGCATACGC TTGATCCGGC TACCTGCCCA TTCGACCACC 
TTACGCCGCC GACGTATGCG AACTAGGCCG ATGGACGGGT AAGCTGGTGG 



5451 AAGCGAAACA TCGCATCGAG CGAGCACGTA CTCGGATGGA AGCCGGTCTT 
TTCGCTTTGT AGCGTAGCTC GCTCGTGCAT GAGCCTACCT TCGGCCAGAA 



5501 GTCGATCAGG ATGATCTGGA CGAAGAGCAT CAGGGGCTCG CGCCAGCCGA 
CAGCTAGTCC TACTAGACCT GCTTCTCGTA GTCCCCGAGC GCGGTCGGCT 



5551 ACTGTTCGCC AGGCTCAAGG CGCGCATGCC CGACGGCGAG GATCTCGTCG 
TGACAAGCGG TCCGAGTTCC GCGCGTACGG GCTGCCGCTC CTAGAGCAGC



5601 TGACCCATGG CGATGCCTGC TTGCCGAATA TCATGGTGGA AAATGGCCGC 
ACTGGGTACC GCTACGGACG AACGGCTTAT AGTACCACCT TTTACCGGCG 



5651 TTTTCTGGAT TCATCGACTG TGGCCGGCTG GGTGTGGCGG ACCGCTATCA 
AAAAGACCTA AGTAGCTGAC ACCGGCCGAC CCACACCGCC TGGCGATAGT 
5701 GGACATAGCG TTGGCTACCC GTGATATTGC TGAAGAGCTT GGCGGCGAAT
CCTGTATCGC AACCGATGGG CACTATAACG ACTTCTCGAA CCGCCGCTTA



5751 GGGCTGACCG CTTCCTCGTG CTTTACGGTA TCGCCGCTCC CGATTCGCAG
CCCGACTGGC GAAGGAGCAC GAAATGCCAT AGCGGCGAGG GCTAAGCGTC



5801 CGCATCGCCT TCTATCGCCT TCTTGACGAG TTCTTCTGAG CGGGACTCTG
GCGTAGCGGA AGATAGCGGA AGAACTGCTC AAGAAGACTC GCCCTGAGAC



5851 GGGTTCGCAT CGATAAAATA AAAGATTTTA TTTAGTCTCC AGAAAAAGGG
CCCAAGCGTA GCTATTTTAT TTTCTAAAAT AAATCAGAGG TCTTTTTCCC



5901 GGGAATGAAA GACCCCACCT GTAGGTTTGG CAAGCTAGCT TAAGTAACGC
CCCTTACTTT CTGGGGTGGA CATCCAAACC GTTCGATCGA ATTCATTGCG



5951 CATTTTGCAA GGCATGGAAA AATACATAAC TGAGAATAGA GAAGTTCAGA
GTAAAACGTT CCGTACCTTT TTATGTATTG ACTCTTATCT CTTCAAGTCT



6001 TCAAGGTCAG GAACAGATGG AACAGCTGAA TATGGGCCAA ACAGGATATC
AGTTCCAGTC CTTGTCTACC TTGTCGACTT ATACCCGGTT TGTCCTATAG



6051 TGTGGTAAGC AGTTCCTGCC CCGGCTCAGG GCCAAGAACA GATGGAACAG
ACACCATTCG TCAAGGACGG GGCCGAGTCC CGGTTCTTGT CTACCTTGTC



6101 CTGAATATGG GCCAAACAGG ATATCTGTGG TAAGCAGTTC CTGCCCCGGC
GACTTATACC CGGTTTGTCC TATAGACACC ATTCGTCAAG GACGGGGCCG



6151 TCAGGGCCAA GAACAGATGG TCCCCAGATG CGGTCCAGCC CTCAGCAGTT
AGTCCCGGTT CTTGTCTACC AGGGGTCTAC GCCAGGTCGG GAGTCGTCAA



6201 TCTAGAGAAC CATCAGATGT TTCCAGGGTG CCCCAAGGAC CTGAAATGAC
AGATCTCTTG GTAGTCTACA AAGGTCCCAC GGGGTTCCTG GACTTTACTG



6251 CCTGTGCCTT ATTTGAACTA ACCAATCAGT TCGCTTCTCG CTTCTGTTCG
GGACACGGAA TAAACTTGAT TGGTTAGTCA AGCGAAGAGC GAAGACAAGC 



6301 CGCGCTTCTG CTCCCCGAGC TCAATAAAAG AGCCCACAAC CCCTCACTCG 
GCGCGAAGAC GAGGGGCTCG AGTTATTTTC TCGGGTGTTG GGGAGTGAGC 



6351 GGGCGCCAGT CCTCCGATTG ACTGAGTCGC CCGGGTACCC GTGTATCCAA 
CCCGCGGTCA GGAGGCTAAC TGACTCAGCG GGCCCATGGG CACATAGGTT 



6401 TAAACCCTCT TGCAGTTGCA TCCGACTTGT GGTCTCGCTG TTCCTTGGGA 
ATTTGGGAGA ACGTCAACGT AGGCTGAACA CCAGAGCGAC AAGGAACCCT 



6451 GGGTCTCCTC TGAGTGATTG ACTACCCGTC AGCGGGGGTC TTTCATTCAT 
CCCAGAGGAG ACTCACTAAC TGATGGGCAG TCGCCCCCAG AAAGTAAGTA 



6501 GCAGCATGTA TCAAAATTAA TTTGGTTTTT TTTCTTAAGT ATTTACATTA 
CGTCGTACAT AGTTTTAATT AAACCAAAAA AAAGAATTCA TAAATGTAAT 



6551 AATGGCCATA GTTGCATTAA TGAATCGGCC AACGCGCGGG GAGAGGCGGT 
TTACCGGTAT CAACGTAATT ACTTAGCCGG TTGCGCGCCC CTCTCCGCCA 



6601 TTGCGTATTG GCGCTCTTCC GCTTCCTCGC TCACTGACTC GCTGCGCTCG 
AACGCATAAC CGCGAGAAGG CGAAGGAGCG AGTGACTGAG CGACGCGAGC 
6651 GTCGTTCGGC TGCGGCGAGC GGTATCAGCT CACTCAAAGG CGGTAATACG
CAGCAAGCCG ACGCCGCTCG CCATAGTCGA GTGAGTTTCC GCCATTATGC 



6701 GTTATCCACA GAATCAGGGG ATAACGCAGG AAAGAACATG TGAGCAAAAG
CAATAGGTGT CTTAGTCCCC TATTGCGTCC TTTCTTGTAC ACTCGTTTTC



6751 GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT GGCGTTTTTC
CGGTCGTTTT CCGGTCCTTG GCATTTTTCC GGCGCAACGA CCGCAAAAAG



6801 CATAGGCTCC GCCCCCCTGA CGAGCATCAC AAAAATCGAC GCTCAAGTCA
GTATCCGAGG CGGGGGGACT GCTCGTAGTG TTTTTAGCTG CGAGTTCAGT



6851 GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG TTTCCCCCTG
CTCCACCGCT TTGGGCTGTC CTGATATTTC TATGGTCCGC AAAGGGGGAC



6901 GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC
CTTCGAGGGA GCACGCGAGA GGACAAGGCT GGGACGGCGA ATGGCCTATG



6951 CTGTCCGCCT TTCTCCCTTC GGGAAGCGTG GCGCTTTCTC ATAGCTCACG
GACAGGCGGA AAGAGGGAAG CCCTTCGCAC CGCGAAAGAG TATCGAGTGC



7001 CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG
GACATCCATA GAGTCAAGCC ACATCCAGCA AGCGAGGTTC GACCCGACAC



7051 TGCACGAACC CCCCGTTCAG CCCGACCGCT GCGCCTTATC CGGTAACTAT
ACGTGCTTGG GGGGCAAGTC GGGCTGGCGA CGCGGAATAG GCCATTGATA



7101 CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC TGGCAGCAGC
GCAGAACTCA GGTTGGGCCA TTCTGTGCTG AATAGCGGTG ACCGTCGTCG



7151 CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT
GTGACCATTG TCCTAATCGT CTCGCTCCAT ACATCCGCCA CGATGTCTCA



7201 TCTTGAAGTG GTGGCCTAAC TACGGCTACA CTAGAAGAAC AGTATTTGGT
AGAACTTCAC CACCGGATTG ATGCCGATGT GATCTTCTTG TCATAAACCA



7251 ATCTGCGCTC TGCTGAAGCC AGTTACCTTC GGAAAAAGAG TTGGTAGCTC 
TAGACGCGAG ACGACTTCGG TCAATGGAAG CCTTTTTCTC AACCATCGAG 



7301 TTGATCCGGC AAACAAACCA CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA
AACTAGGCCG TTTGTTTGGT GGCGACCATC GCCACCAAAA AAACAAACGT 



7351 AGCAGCAGAT TACGCGCAGA AAAAAAGGAT CTCAAGAAGA TCCTTTGATC
TCGTCGTCTA ATGCGCGTCT TTTTTTCCTA GAGTTCTTCT AGGAAACTAG 



7401 TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC GTTAAGGGAT
AAAAGATGCC CCAGACTGCG AGTCACCTTG CTTTTGAGTG CAATTCCCTA 



7451 TTTGGTCATG AGATTATCAA AAAGGATCTT CACCTAGATC CTTTTGCGGC
AAACCAGTAC TCTAATAGTT TTTCCTAGAA GTGGATCTAG GAAAACGCCG 



7501 CGCAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTACCA 
GCGTTTAGTT AGATTTCATA TATACTCATT TGAACCAGAC TGTCAATGGT 



7551 ATGCTTAATC AGTGAGGCAC CTATCTCAGC GATCTGTCTA TTTCGTTCAT 
TACGAATTAG TCACTCCGTG GATAGAGTCG CTAGACAGAT AAAGCAAGTA 
7601 CCATAGTTGC CTGACTCCCC GTCGTGTAGA TAACTACGAT ACGGGAGGGC
GGTATCAACG GACTGAGGGG CAGCACATCT ATTGATGCTA TGCCCTCCCG 



7651 TTACCATCTG GCCCCAGTGC TGCAATGATA CCGCGAGACC CACGCTCACC
AATGGTAGAC CGGGGTCACG ACGTTACTAT GGCGCTCTGG GTGCGAGTGG 



7701 GGCTCCAGAT TTATCAGCAA TAAACCAGCC AGCCGGAAGG GCCGAGCGCA
CCGAGGTCTA AATAGTCGTT ATTTGGTCGG TCGGCCTTCC CGGCTCGCGT 



7751 GAAGTGGTCC TGCAACTTTA TCCGCCTCCA TCCAGTCTAT TAATTGTTGC 
CTTCACCAGG ACGTTGAAAT AGGCGGAGGT AGGTCAGATA ATTAACAACG 



7801 CGGGAAGCTA GAGTAAGTAG TTCGCCAGTT AATAGTTTGC GCAACGTTGT 
GCCCTTCGAT CTCATTCATC AAGCGGTCAA TTATCAAACG CGTTGCAACA 



7851 TGCCATTGCT ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT 
ACGGTAACGA TGTCCGTAGC ACCACAGTGC GAGCAGCAAA CCATACCGAA 



7901 CATTCAGCTC CGGTTCCCAA CGATCAAGGC GAGTTACATG ATCCCCCATG
GTAAGTCGAG GCCAAGGGTT GCTAGTTCCG CTCAATGTAC TAGGGGGTAC



7951 TTGTGCAAAA AAGCGGTTAG CTCCTTCGGT CCTCCGATCG TTGTCAGAAG
AACACGTTTT TTCGCCAATC GAGGAAGCCA GGAGGCTAGC AACAGTCTTC



8001 TAAGTTGGCC GCAGTGTTAT CACTCATGGT TATGGCAGCA CTGCATAATT
ATTCAACCGG CGTCACAATA GTGAGTACCA ATACCGTCGT GACGTATTAA



8051 CTCTTACTGT CATGCCATCC GTAAGATGCT TTTCTGTGAC TGGTGAGTAC
GAGAATGACA GTACGGTAGG CATTCTACGA AAAGACACTG ACCACTCATG 



8101 TCAACCAAGT CATTCTGAGA ATAGTGTATG CGGCGACCGA GTTGCTCTTG
AGTTGGTTCA GTAAGACTCT TATCACATAC GCCGCTGGCT CAACGAGAAC



8151 CCCGGCGTCA ATACGGGATA ATACCGCGCC ACATAGCAGA ACTTTAAAAG
GGGCCGCAGT TATGCCCTAT TATGGCGCGG TGTATCGTCT TGAAATTTTC



8201 TGCTCATCAT TGGAAAACGT TCTTCGGGGC GAAAACTCTC AAGGATCTTA 
ACGAGTAGTA ACCTTTTGCA AGAAGCCCCG CTTTTGAGAG TTCCTAGAAT 



8251 CCGCTGTTGA GATCCAGTTC GATGTAACCC ACTCGTGCAC CCAACTGATC 
GGCGACAACT CTAGGTCAAG CTACATTGGG TGAGCACGTG GGTTGACTAG 



8301 TTCAGCATCT TTTACTTTCA CCAGCGTTTC TGGGTGAGCA AAAACAGGAA 
AAGTCGTAGA AAATGAAAGT GGTCGCAAAG ACCCACTCGT TTTTGTCCTT 



8351 GGCAAAATGC CGCAAAAAAG GGAATAAGGG CGACACGGAA ATGTTGAATA 
CCGTTTTACG GCGTTTTTTC CCTTATTCCC GCTGTGCCTT TACAACTTAT 



8401 CTCATACTCT TCCTTTTTCA ATATTATTGA AGCATTTATC AGGGTTATTG
GAGTATGAGA AGGAAAAAGT TATAATAACT TCGTAAATAG TCCCAATAAC 



8451 TCTCATGAGC GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG
AGAGTACTCG CCTATGTATA AACTTACATA AATCTTTTTA TTTGTTTATC 



8501 GGGTTCCGCG CACATTTC 
CCCAAGGCGC GTGTAAAG 
1 CTGCAGCCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG
GACGTCGGAC TTATACCCGG TTTGTCCTAT AGACACCATT CGTCAAGGAC 



51 CCCCGGCTCA GGGCCAAGAA CAGATGGAAC AGCTGAATAT GGGCCAAACA
GGGGCCGAGT CCCGGTTCTT GTCTACCTTG TCGACTTATA CCCGGTTTGT



101 GGATATCTGT GGTAAGCAGT TCCTGCCCCG GCTCAGGGCC AAGAACAGAT
CCTATAGACA CCATTCGTCA AGGACGGGGC CGAGTCCCGG TTCTTGTCTA



151 GGTCCCCAGA TGCGGTCCAG CCCTCAGCAG TTTCTAGAGA ACCATCAGAT
CCAGGGGTCT ACGCCAGGTC GGGAGTCGTC AAAGATCTCT TGGTAGTCTA



201 GTTTCCAGGG TGCCCCAAGG ACCTGAAATG ACCCTGTGCC TTATTTGAAC
CAAAGGTCCC ACGGGGTTCC TGGACTTTAC TGGGACACGG AATAAACTTG



251 TAACCAATCA GTTCGCTTCT CGCTTCTGTT CGCGCGCTTC TGCTCCCCGA
ATTGGTTAGT CAAGCGAAGA GCGAAGACAA GCGCGCGAAG ACGAGGGGCT



301 GCTCAATAAA AGAGCCCACA ACCCCTCACT CGGGGCGCCA GTCCTCCGAT
CGAGTTATTT TCTCGGGTGT TGGGGAGTGA GCCCCGCGGT CAGGAGGCTA



351 TGACTGAGTC GCCCGGGTAC CCGTGTATCC AATAAACCCT CTTGCAGTTG
ACTGACTCAG CGGGCCCATG GGCACATAGG TTATTTGGGA GAACGTCAAC



401 CATCCGACTT GTGGTCTCGC TGTTCCTTGG GAGGGTCTCC TCTGAGTGAT
GTAGGCTGAA CACCAGAGCG ACAAGGAACC CTCCCAGAGG AGACTCACTA



451 TGACTACCCG TCAGCGGGGG TCTTTCATTT GGGGGCTCGT CCGGGATCGG
ACTGATGGGC AGTCGCCCCC AGAAAGTAAA CCCCCGAGCA GGCCCTAGCC



501 GAGACCCCTG CCCAGGGACC ACCGACCCAC CACCGGGAGG CAAGCTGGCC
CTCTGGGGAC GGGTCCCTGG TGGCTGGGTG GTGGCCCTCC GTTCGACCGG



551 AGCAACTTAT CTGTGTCTGT CCGATTGTCT AGTGTCTATG ACTGATTTTA
TCGTTGAATA GACACAGACA GGCTAACAGA TCACAGATAC TGACTAAAAT



601 TGCGCCTGCG TCGGTACTAG TTAGCTAACT AGCTCTGTAT CTGGCGGACC
ACGCGGACGC AGCCATGATC AATCGATTGA TCGAGACATA GACCGCCTGG



651 CGTGGTGGAA CTGACGAGTT CTGAACACCC GGCCGCAACC CTGGGAGACG
GCACCACCTT GACTGCTCAA GACTTGTGGG CCGGCGTTGG GACCCTCTGC



701 TCCCAGGGAC TTTGGGGGCC GTTTTTGTGG CCCGACCTGA GGAAGGGAGT
AGGGTCCCTG AAACCCCCGG CAAAAACACC GGGCTGGACT CCTTCCCTCA



751 CGATGTGGAA TCCGACCCCG TCAGGATATG TGGTTCTGGT AGGAGACGAG
GCTACACCTT AGGCTGGGGC AGTCCTATAC ACCAAGACCA TCCTCTGCTC



801 AACCTAAAAC AGTTCCCGCC TCCGTCTGAA TTTTTGCTTT CGGTTTGGAA
TTGGATTTTG TCAAGGGCGG AGGCAGACTT AAAAACGAAA GCCAAACCTT



851 CCGAAGCCGC GCGTCTTGTC TGCTGCAGCA TCGTTCTGTG TTGTCTCTGT
GGCTTCGGCG CGCAGAACAG ACGACGTCGT AGCAAGACAC AACAGAGACA



901 CTGACTGTGT TTCTGTATTT GTCTGAAAAT TAGGGCCAGA CTGTTACCAC
GACTGACACA AAGACATAAA CAGACTTTTA ATCCCGGTCT GACAATGGTG
951 TCCCTTAAGT TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC
AGGGAATTCA AACTGGAATC CATTGACCTT TCTACAGCTC GCCGAGCGAG 



1001 ACAACCAGTC GGTAGATGTC AAGAAGAGAC GTTGGGTTAC CTTCTGCTCT 
TGTTGGTCAG CCATCTACAG TTCTTCTCTG CAACCCAATG GAAGACGAGA 



1051 GCAGAATGGC CAACCTTTAA CGTCGGATGG CCGCGAGACG GCACCTTTAA 
CGTCTTACCG GTTGGAAATT GCAGCCTACC GGCGCTCTGC CGTGGAAATT 



1101 CCGAGACCTC ATCACCCAGG TTAAGATCAA GGTCTTTTCA CCTGGCCCGC 
GGCTCTGGAG TAGTGGGTCC AATTCTAGTT CCAGAAAAGT GGACCGGGCG 



1151 ATGGACACCC AGACCAGGTC CCCTACATCG TGACCTGGGA AGCCTTGGCT 
TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA 



1201 TTTGACCCCC CTCCCTGGGT CAAGCCCTTT GTACACCCTA AGCCTCCGCC 
AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG 



1251 TCCTCTTCCT CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA 
AGGAGAAGGA GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT



1301 CCCCGCCTCG ATCCTCCCTT TATCCAGCCC TCACTCCTTC TCTAGGCGCC
GGGGCGGAGC TAGGAGGGAA ATAGGTCGGG AGTGAGGAAG AGATCCGCGG



1351 GGCCGCTCTA GCCCATTAAT ACGACTCACT ATAGGGCGAT TCGAATCAGG
CCGGCGAGAT CGGGTAATTA TGCTGAGTGA TATCCCGCTA AGCTTAGTCC



1401 CCTTGGCGCG CCGGATCCTT AATTAAGCGC AATTGGGAGG TGGCGGTAGC 
GGAACCGCGC GGCCTAGGAA TTAATTCGCG TTAACCCTCC ACCGCCATCG



1451 CTCGAGATGG GCGTGATTAC GGATTCACTG GCCGTCGTTT TACAACGTCG
GAGCTCTACC CGCACTAATG CCTAAGTGAC CGGCAGCAAA ATGTTGCAGC 



1501 TGACTGGGAA AACCCTGGCG TTACCCAACT TAATCGCCTT GCAGCACATC
ACTGACCCTT TTGGGACCGC AATGGGTTGA ATTAGCGGAA CGTCGTGTAG



1551 CCCCTTTCGC CAGCTGGCGT AATAGCGAAG AGGCCCGCAC CGATCGCCCT 
GGGGAAAGCG GTCGACCGCA TTATCGCTTC TCCGGGCGTG GCTAGCGGGA 



1601 TCCCAACAGT TACGCAGCCT GAATGGCGAA TGGCGCTTTG CCTGGTTTCC 
AGGGTTGTCA ATGCGTCGGA CTTACCGCTT ACCGCGAAAC GGACCAAAGG 



1651 GGCACCAGAA GCGGTGCCGG AAAGCTGGCT GGAGTGCGAT CTTCCTGAGG 
CCGTGGTCTT CGCCACGGCC TTTCGACCGA CCTCACGCTA GAAGGACTCC 



1701 CCGATACTGT CGTCGTCCCC TCAAACTGGC AGATGCACGG TTACGATGCG
GGCTATGACA GCAGCAGGGG AGTTTGACCG TCTACGTGCC AATGCTACGC 



1751 CCCATCTACA CCAACGTGAC CTATCCCATT ACGGTCAATC CGCCGTTTGT 
GGGTAGATGT GGTTGCACTG GATAGGGTAA TGCCAGTTAG GCGGCAAACA 



1801 TCCCACGGAG AATCCGACGG GTTGTTACTC GCTCACATTT AATGTTGATG 
AGGGTGCCTC TTAGGCTGCC CAACAATGAG CGAGTGTAAA TTACAACTAC 



1851 AAAGCTGGCT ACAGGAAGGC CAGACGCGAA TTATTTTTGA TGGCGTTAAC 
TTTCGACCGA TGTCCTTCCG GTCTGCGCTT AATAAAAACT ACCGCAATTG 



1901 TCGGCGTTTC ATCTGTGGTG CAACGGGCGC TGGGTCGGTT ACGGCCAGGA 
AGCCGCAAAG TAGACACCAC GTTGCCCGCG ACCCAGCCAA TGCCGGTCCT 



1951 CAGTCGTTTG CCGTCTGAAT TTGACCTGAG CGCATTTTTA CGCGCCGGAG
GTCAGCAAAC GGCAGACTTA AACTGGACTC GCGTAAAAAT GCGCGGCCTC



2001 AAAACCGCCT CGCGGTGATG GTGCTGCGCT GGAGTGACGG CAGTTATCTG
TTTTGGCGGA GCGCCACTAC CACGACGCGA CCTCACTGCC GTCAATAGAC



2051 GAAGATCAGG ATATGTGGCG GATGAGCGGC ATTTTCCGTG ACGTCTCGTT
CTTCTAGTCC TATACACCGC CTACTCGCCG TAAAAGGCAC TGCAGAGCAA



2101 GCTGCATAAA CCGACTACAC AAATCAGCGA TTTCCATGTT GCCACTCGCT
CGACGTATTT GGCTGATGTG TTTAGTCGCT AAAGGTACAA CGGTGAGCGA



2151 TTAATGATGA TTTCAGCCGC GCTGTACTGG AGGCTGAAGT TCAGATGTGC
AATTACTACT AAAGTCGGCG CGACATGACC TCCGACTTCA AGTCTACACG



2201 GGCGAGTTGC GTGACTACCT ACGGGTAACA GTTTCTTTAT GGCAGGGTGA
CCGCTCAACG CACTGATGGA TGCCCATTGT CAAAGAAATA CCGTCCCACT



2251 AACGCAGGTC GCCAGCGGCA CCGCGCCTTT CGGCGGTGAA ATTATCGATG
TTGCGTCCAG CGGTCGCCGT GGCGCGGAAA GCCGCCACTT TAATAGCTAC



2301 AGCGTGGTGG TTATGCCGAT CGCGTCACAC TACGTCTGAA CGTCGAAAAC
TCGCACCACC AATACGGCTA GCGCAGTGTG ATGCAGACTT GCAGCTTTTG



2351 CCGAAACTGT GGAGCGCCGA AATCCCGAAT CTCTATCGTG CGGTGGTTGA
GGCTTTGACA CCTCGCGGCT TTAGGGCTTA GAGATAGCAC GCCACCAACT



2401 ACTGCACACC GCCGACGGCA CGCTGATTGA AGCAGAAGCC TGCGATGTCG
TGACGTGTGG CGGCTGCCGT GCGACTAACT TCGTCTTCGG ACGCTACAGC



2451 GTTTCCGCGA GGTGCGGATT GAAAATGGTC TGCTGCTGCT GAACGGCAAG
CAAAGGCGCT CCACGCCTAA CTTTTACCAG ACGACGACGA CTTGCCGTTC



2501 CCGTTGCTGA TTCGAGGCGT TAACCGTCAC GAGCATCATC CTCTGCATGG 
GGCAACGACT AAGCTCCGCA ATTGGCAGTG CTCGTAGTAG GAGACGTACC 



2551 TCAGGTCATG GATGAGCAGA CGATGGTGCA GGATATCCTG CTGATGAAGC 
AGTCCAGTAC CTACTCGTCT GCTACCACGT CCTATAGGAC GACTACTTCG 



2601 AGAACAACTT TAACGCCGTG CGCTGTTCGC ATTATCCGAA CCATCCGCTG
TCTTGTTGAA ATTGCGGCAC GCGACAAGCG TAATAGGCTT GGTAGGCGAC 



2651 TGGTACACGC TGTGCGACCG CTACGGCCTG TATGTGGTGG ATGAAGCCAA
ACCATGTGCG ACACGCTGGC GATGCCGGAC ATACACCACC TACTTCGGTT 



2701 TATTGAAACC CACGGCATGG TGCCAATGAA TCGTCTGACC GATGATCCGC 
ATAACTTTGG GTGCCGTACC ACGGTTACTT AGCAGACTGG CTACTAGGCG 



2751 GCTGGCTACC GGCGATGAGC GAACGCGTAA CGCGAATGGT GCAGCGCGAT 
CGACCGATGG CCGCTACTCG CTTGCGCATT GCGCTTACCA CGTCGCGCTA 



2801 CGTAATCACC CGAGTGTGAT CATCTGGTCG CTGGGGAATG AATCAGGCCA 
GCATTAGTGG GCTCACACTA GTAGACCAGC GACCCCTTAC TTAGTCCGGT 
2851 CGGCGCTAAT CACGACGCGC TGTATCGCTG GATCAAATCT GTCGATCCTT
GCCGCGATTA GTGCTGCGCG ACATAGCGAC CTAGTTTAGA CAGCTAGGAA 



2901 CCCGCCCGGT GCAGTATGAA GGCGGCGGAG CCGACACCAC GGCCACCGAT
GGGCGGGCCA CGTCATACTT CCGCCGCCTC GGCTGTGGTG CCGGTGGCTA



2951 ATTATTTGCC CGATGTACGC GCGCGTGGAT GAAGACCAGC CCTTCCCGGC
TAATAAACGG GCTACATGCG CGCGCACCTA CTTCTGGTCG GGAAGGGCCG



3001 TGTGCCGAAA TGGTCCATCA AAAAATGGCT TTCGCTACCT GGAGAGACGC
ACACGGCTTT ACCAGGTAGT TTTTTACCGA AAGCGATGGA CCTCTCTGCG



3051 GCCCGCTGAT CCTTTGCGAA TACGCCCACG CGATGGGTAA CAGTCTTGGC
CGGGCGACTA GGAAACGCTT ATGCGGGTGC GCTACCCATT GTCAGAACCG



3101 GGTTTCGCTA AATACTGGCA GGCGTTTCGT CAGTATCCCC GTTTACAGGG
CCAAAGCGAT TTATGACCGT CCGCAAAGCA GTCATAGGGG CAAATGTCCC



3151 CGGCTTCGTC TGGGACTGGG TGGATCAGTC GCTGATTAAA TATGATGAAA
GCCGAAGCAG ACCCTGACCC ACCTAGTCAG CGACTAATTT ATACTACTTT



3201 ACGGCAACCC GTGGTCGGCT TACGGCGGTG ATTTTGGCGA TACGCCGAAC
TGCCGTTGGG CACCAGCCGA ATGCCGCCAC TAAAACCGCT ATGCGGCTTG



3251 GATCGCCAGT TCTGTATGAA CGGTCTGGTC TTTGCCGACC GCACGCCGCA
CTAGCGGTCA AGACATACTT GCCAGACCAG AAACGGCTGG CGTGCGGCGT



3301 TCCAGCGCTG ACGGAAGCAA AACACCAGCA GCAGTTTTTC CAGTTCCGTT
AGGTCGCGAC TGCCTTCGTT TTGTGGTCGT CGTCAAAAAG GTCAAGGCAA



3351 TATCCGGGCA AACCATCGAA GTGACCAGCG AATACCTGTT CCGTCATAGC
ATAGGCCCGT TTGGTAGCTT CACTGGTCGC TTATGGACAA GGCAGTATCG



3401 GATAACGAGC TCCTGCACTG GATGGTGGCG CTGGATGGTA AGCCGCTGGC
CTATTGCTCG AGGACGTGAC CTACCACCGC GACCTACCAT TCGGCGACCG



3451 AAGCGGTGAA GTGCCTCTGG ATGTCGCTCC ACAAGGTAAA CAGTTGATTG
TTCGCCACTT CACGGAGACC TACAGCGAGG TGTTCCATTT GTCAACTAAC



3501 AACTGCCTGA ACTACCGCAG CCGGAGAGCG CCGGGCAACT CTGGCTCACA
TTGACGGACT TGATGGCGTC GGCCTCTCGC GGCCCGTTGA GACCGAGTGT



3551 GTACGCGTAG TGCAACCGAA CGCGACCGCA TGGTCAGAAG CCGGGCACAT 
CATGCGCATC ACGTTGGCTT GCGCTGGCGT ACCAGTCTTC GGCCCGTGTA 



3601 CAGCGCCTGG CAGCAGTGGC GTCTGGCGGA AAACCTCAGT GTGACGCTCC 
GTCGCGGACC GTCGTCACCG CAGACCGCCT TTTGGAGTCA CACTGCGAGG 



3651 CCGCCGCGTC CCACGCCATC CCGCATCTGA CCACCAGCGA AATGGATTTT 
GGCGGCGCAG GGTGCGGTAG GGCGTAGACT GGTGGTCGCT TTACCTAAAA 



3701 TGCATCGAGC TGGGTAATAA GCGTTGGCAA TTTAACCGCC AGTCAGGCTT
ACGTAGCTCG ACCCATTATT CGCAACCGTT AAATTGGCGG TCAGTCCGAA 



3751 TCTTTCACAG ATGTGGATTG GCGATAAAAA ACAACTGCTG ACGCCGCTGC
AGAAAGTGTC TACACCTAAC CGCTATTTTT TGTTGACGAC TGCGGCGACG 
3801 GCGATCAGTT CACCCGTGTC GATAGATCTG AACAGAAACT CATTTCCGAA
CGCTAGTCAA GTGGGCACAG CTATCTAGAC TTGTCTTTGA GTAAAGGCTT 



3851 GAAGACCTAG TCGACCATCA TCATCATCAT CACCGGTAAT AATAGGTAGA 
CTTCTGGATC AGCTGGTAGT AGTAGTAGTA GTGGCCATTA TTATCCATCT 



3901 TAAGTGACTG ATTAGATGCA TTTCGACTAG ATCCCTCGAC CAATTCCGGT 
ATTCACTGAC TAATCTACGT AAAGCTGATC TAGGGAGCTG GTTAAGGCCA 



3951 TATTTTCCAC CATATTGCCG TCTTTTGGCA ATGTGAGGGC CCGGAAACCT
ATAAAAGGTG GTATAACGGC AGAAAACCGT TACACTCCCG GGCCTTTGGA 



4001 GGCCCTGTCT TCTTGACGAG CATTCCTAGG GGTCTTTCCC CTCTCGCCAA 
CCGGGACAGA AGAACTGCTC GTAAGGATCC CCAGAAAGGG GAGAGCGGTT 



4051 AGGAATGCAA GGTCTGTTGA ATGTCGTGAA GGAAGCAGTT CCTCTGGAAG 
TCCTTACGTT CCAGACAACT TACAGCACTT CCTTCGTCAA GGAGACCTTC 



4101 CTTCTTGAAG ACAAACAACG TCTGTAGCGA CCCTTTGCAG GCAGCGGAAC 
GAAGAACTTC TGTTTGTTGC AGACATCGCT GGGAAACGTC CGTCGCCTTG



4151 CCCCCACCTG GCGACAGGTG CCTCTGCGGC CAAAAGCCAC GTGTATAAGA
GGGGGTGGAC CGCTGTCCAC GGAGACGCCG GTTTTCGGTG CACATATTCT



4201 TACACCTGCA AAGGCGGCAC AACCCCAGTG CCACGTTGTG AGTTGGATAG
ATGTGGACGT TTCCGCCGTG TTGGGGTCAC GGTGCAACAC TCAACCTATC



4251 TTGTGGAAAG AGTCAAATGG CTCTCCTCAA GCGTATTCAA CAAGGGGCTG
AACACCTTTC TCAGTTTACC GAGAGGAGTT CGCATAAGTT GTTCCCCGAC



4301 AAGGATGCCC AGAAGGTACC CCATTGTATG GGATCTGATC TGGGGCCTCG
TTCCTACGGG TCTTCCATGG GGTAACATAC CCTAGACTAG ACCCCGGAGC 



4351 GTGCACATGC TTTACATGTG TTTAGTCGAG GTTAAAAAAC GTCTAGGCCC
CACGTGTACG AAATGTACAC AAATCAGCTC CAATTTTTTG CAGATCCGGG



4401 CCCGAACCAC GGGGACGTGG TTTTCCTTTG AAAAACACGA TGATAATACC
GGGCTTGGTG CCCCTGCACC AAAAGGAAAC TTTTTGTGCT ACTATTATGG



4451 ATGAAAAAGC CTGAACTCAC CGCGACGTCT GTCGAGAAGT TTCTGATCGA
TACTTTTTCG GACTTGAGTG GCGCTGCAGA CAGCTCTTCA AAGACTAGCT



4501 AAAGTTCGAC AGCGTCTCCG ACCTGATGCA GCTCTCGGAG GGCGAAGAAT 
TTTCAAGCTG TCGCAGAGGC TGGACTACGT CGAGAGCCTC CCGCTTCTTA 



4551 CTCGTGCTTT CAGCTTCGAT GTAGGAGGGC GTGGATATGT CCTGCGGGTA 
GAGCACGAAA GTCGAAGCTA CATCCTCCCG CACCTATACA GGACGCCCAT 



4601 AATAGCTGCG CCGATGGTTT CTACAAAGAT CGTTATGTTT ATCGGCACTT
TTATCGACGC GGCTACCAAA GATGTTTCTA GCAATACAAA TAGCCGTGAA



4651 TGCATCGGCC GCGCTCCCGA TTCCGGAAGT GCTTGACATT GGGGAATTTA
ACGTAGCCGG CGCGAGGGCT AAGGCCTTCA CGAACTGTAA CCCCTTAAAT 



4701 GCGAGAGCCT GACCTATTGC ATCTCCCGCC GTGCACAGGG TGTCACGTTG
CGCTCTCGGA CTGGATAACG TAGAGGGCGG CACGTGTCCC ACAGTGCAAC 
4751 CAAGACCTGC CTGAAACCGA ACTGCCCGCT GTTCTGCAGC CGGTCGCGGA
GTTCTGGACG GACTTTGGCT TGACGGGCGA CAAGACGTCG GCCAGCGCCT 



4801 GGCCATGGAT GCGATCGCTG CGGCCGATCT TAGCCAGACG AGCGGGTTCG
CCGGTACCTA CGCTAGCGAC GCCGGCTAGA ATCGGTCTGC TCGCCCAAGC



4851 GCCCATTCGG ACCGCAAGGA ATCGGTCAAT ACACTACATG GCGTGATTTC
CGGGTAAGCC TGGCGTTCCT TAGCCAGTTA TGTGATGTAC CGCACTAAAG



4901 ATATGCGCGA TTGCTGATCC CCATGTGTAT CACTGGCAAA CTGTGATGGA
TATACGCGCT AACGACTAGG GGTACACATA GTGACCGTTT GACACTACCT



4951 CGACACCGTC AGTGCGTCCG TCGCGCAGGC TCTCGATGAG CTGATGCTTT
GCTGTGGCAG TCACGCAGGC AGCGCGTCCG AGAGCTACTC GACTACGAAA



5001 GGGCCGAGGA CTGCCCCGAA GTCCGGCACC TCGTGCACGC GGATTTCGGC
CCCGGCTCCT GACGGGGCTT CAGGCCGTGG AGCACGTGCG CCTAAAGCCG



5051 TCCAACAATG TCCTGACGGA CAATGGCCGC ATAACAGCGG TCATTGACTG
AGGTTGTTAC AGGACTGCCT GTTACCGGCG TATTGTCGCC AGTAACTGAC



5101 GAGCGAGGCG ATGTTCGGGG ATTCCCAATA CGAGGTCGCC AACATCTTCT
CTCGCTCCGC TACAAGCCCC TAAGGGTTAT GCTCCAGCGG TTGTAGAAGA



5151 TCTGGAGGCC GTGGTTGGCT TGTATGGAGC AGCAGACGCG CTACTTCGAG
AGACCTCCGG CACCAACCGA ACATACCTCG TCGTCTGCGC GATGAAGCTC



5201 CGGAGGCATC CGGAGCTTGC AGGATCGCCG CGGCTCCGGG CGTATATGCT
GCCTCCGTAG GCCTCGAACG TCCTAGCGGC GCCGAGGCCC GCATATACGA



5251 CCGCATTGGT CTTGACCAAC TCTATCAGAG CTTGGTTGAC GGCAATTTCG
GGCGTAACCA GAACTGGTTG AGATAGTCTC GAACCAACTG CCGTTAAAGC



5301 ATGATGCAGC TTGGGCGCAG GGTCGATGCG ACGCAATCGT CCGATCCGGA
TACTACGTCG AACCCGCGTC CCAGCTACGC TGCGTTAGCA GGCTAGGCCT



5351 GCCGGGACTG TCGGGCGTAC ACAAATCGCC CGCAGAAGCG CGGCCGTCTG
CGGCCCTGAC AGCCCGCATG TGTTTAGCGG GCGTCTTCGC GCCGGCAGAC



5401 GACCGATGGC TGTGTAGAAG TACTCGCCGA TAGTGGAAAC CGACGCCCCA
CTGGCTACCG ACACATCTTC ATGAGCGGCT ATCACCTTTG GCTGCGGGGT



5451 GCACTCGTCC GAGGGCAAAG GAATAGAGTA GATGCCGACC GGGATCTATC
CGTGAGCAGG CTCCCGTTTC CTTATCTCAT CTACGGCTGG CCCTAGATAG



5501 GATAAAATAA AAGATTTTAT TTAGTCTCCA GAAAAAGGGG GGAATGAAAG 
CTATTTTATT TTCTAAAATA AATCAGAGGT CTTTTTCCCC CCTTACTTTC 



5551 ACCCCACCTG TAGGTTTGGC AAGCTAGCTT AAGTAACGCC ATTTTGCAAG 
TGGGGTGGAC ATCCAAACCG TTCGATCGAA TTCATTGCGG TAAAACGTTC



5601 GCATGGAAAA ATACATAACT GAGAATAGAG AAGTTCAGAT CAAGGTCAGG 
CGTACCTTTT TATGTATTGA CTCTTATCTC TTCAAGTCTA GTTCCAGTCC 



5651 AACAGATGGA ACAGCTGAAT ATGGGCCAAA CAGGATATCT GTGGTAAGCA 
TTGTCTACCT TGTCGACTTA TACCCGGTTT GTCCTATAGA CACCATTCGT 
5701 GTTCCTGCCC CGGCTCAGGG CCAAGAACAG ATGGAACAGC TGAATATGGG
CAAGGACGGG GCCGAGTCCC GGTTCTTGTC TACCTTGTCG ACTTATACCC 



5751 CCAAACAGGA TATCTGTGGT AAGCAGTTCC TGCCCCGGCT CAGGGCCAAG
GGTTTGTCCT ATAGACACCA TTCGTCAAGG ACGGGGCCGA GTCCCGGTTC 



5801 AACAGATGGT CCCCAGATGC GGTCCAGCCC TCAGCAGTTT CTAGAGAACC
TTGTCTACCA GGGGTCTACG CCAGGTCGGG AGTCGTCAAA GATCTCTTGG 



5851 ATCAGATGTT TCCAGGGTGC CCCAAGGACC TGAAATGACC CTGTGCCTTA 
TAGTCTACAA AGGTCCCACG GGGTTCCTGG ACTTTACTGG GACACGGAAT 



5901 TTTGAACTAA CCAATCAGTT CGCTTCTCGC TTCTGTTCGC GCGCTTCTGC 
AAACTTGATT GGTTAGTCAA GCGAAGAGCG AAGACAAGCG CGCGAAGACG 



5951 TCCCCGAGCT CAATAAAAGA GCCCACAACC CCTCACTCGG GGCGCCAGTC 
AGGGGCTCGA GTTATTTTCT CGGGTGTTGG GGAGTGAGCC CCGCGGTCAG 



6001 CTCCGATTGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAACCCTCTT
GAGGCTAACT GACTCAGCGG GCCCATGGGC ACATAGGTTA TTTGGGAGAA



6051 GCAGTTGCAT CCGACTTGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCT
CGTCAACGTA GGCTGAACAC CAGAGCGACA AGGAACCCTC CCAGAGGAGA 



6101 GAGTGATTGA CTACCCGTCA GCGGGGGTCT TTCATTCATG CAGCATGTAT
CTCACTAACT GATGGGCAGT CGCCCCCAGA AAGTAAGTAC GTCGTACATA



6151 CAAAATTAAT TTGGTTTTTT TTCTTAAGTA TTTACATTAA ATGGCCATAG
GTTTTAATTA AACCAAAAAA AAGAATTCAT AAATGTAATT TACCGGTATC



6201 TTGCATTAAT GAATCGGCCA ACGCGCGGGG AGAGGCGGTT TGCGTATTGG 
AACGTAATTA CTTAGCCGGT TGCGCGCCCC TCTCCGCCAA ACGCATAACC



6251 CGCTCTTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG TCGTTCGGCT
GCGAGAAGGC GAAGGAGCGA GTGACTGAGC GACGCGAGCC AGCAAGCCGA



6301 GCGGCGAGCG GTATCAGCTC ACTCAAAGGC GGTAATACGG TTATCCACAG 
CGCCGCTCGC CATAGTCGAG TGAGTTTCCG CCATTATGCC AATAGGTGTC 



6351 AATCAGGGGA TAACGCAGGA AAGAACATGT GAGCAAAAGG CCAGCAAAAG 
TTAGTCCCCT ATTGCGTCCT TTCTTGTACA CTCGTTTTCC GGTCGTTTTC 



6401 GCCAGGAACC GTAAAAAGGC CGCGTTGCTG GCGTTTTTCC ATAGGCTCCG
CGGTCCTTGG CATTTTTCCG GCGCAACGAC CGCAAAAAGG TATCCGAGGC 



6451 CCCCCCTGAC GAGCATCACA AAAATCGACG CTCAAGTCAG AGGTGGCGAA
GGGGGGACTG CTCGTAGTGT TTTTAGCTGC GAGTTCAGTC TCCACCGCTT 



6501 ACCCGACAGG ACTATAAAGA TACCAGGCGT TTCCCCCTGG AAGCTCCCTC 
TGGGCTGTCC TGATATTTCT ATGGTCCGCA AAGGGGGACC TTCGAGGGAG 



6551 GTGCGCTCTC CTGTTCCGAC CCTGCCGCTT ACCGGATACC TGTCCGCCTT 
CACGCGAGAG GACAAGGCTG GGACGGCGAA TGGCCTATGG ACAGGCGGAA 



6601 TCTCCCTTCG GGAAGCGTGG CGCTTTCTCA TAGCTCACGC TGTAGGTATC 
AGAGGGAAGC CCTTCGCACC GCGAAAGAGT ATCGAGTGCG ACATCCATAG 
6651 TCAGTTCGGT GTAGGTCGTT CGCTCCAAGC TGGGCTGTGT GCACGAACCC
AGTCAAGCCA CATCCAGCAA GCGAGGTTCG ACCCGACACA CGTGCTTGGG 



6701 CCCGTTCAGC CCGACCGCTG CGCCTTATCC GGTAACTATC GTCTTGAGTC
GGGCAAGTCG GGCTGGCGAC GCGGAATAGG CCATTGATAG CAGAACTCAG 



6751 CAACCCGGTA AGACACGACT TATCGCCACT GGCAGCAGCC ACTGGTAACA 
GTTGGGCCAT TCTGTGCTGA ATAGCGGTGA CCGTCGTCGG TGACCATTGT 



6801 GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT CTTGAAGTGG
CCTAATCGTC TCGCTCCATA CATCCGCCAC GATGTCTCAA GAACTTCACC 



6851 TGGCCTAACT ACGGCTACAC TAGAAGAACA GTATTTGGTA TCTGCGCTCT 
ACCGGATTGA TGCCGATGTG ATCTTCTTGT CATAAACCAT AGACGCGAGA 



6901 GCTGAAGCCA GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA 
CGACTTCGGT CAATGGAAGC CTTTTTCTCA ACCATCGAGA ACTAGGCCGT 



6951 AACAAACCAC CGCTGGTAGC GGTGGTTTTT TTGTTTGCAA GCAGCAGATT 
TTGTTTGGTG GCGACCATCG CCACCAAAAA AACAAACGTT CGTCGTCTAA 



7001 ACGCGCAGAA AAAAAGGATC TCAAGAAGAT CCTTTGATCT TTTCTACGGG
TGCGCGTCTT TTTTTCCTAG AGTTCTTCTA GGAAACTAGA AAAGATGCCC



7051 GTCTGACGCT CAGTGGAACG AAAACTCACG TTAAGGGATT TTGGTCATGA
CAGACTGCGA GTCACCTTGC TTTTGAGTGC AATTCCCTAA AACCAGTACT 



7101 GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA AAAATGAAGT
CTAATAGTTT TTCCTAGAAG TGGATCTAGG AAAATTTAAT TTTTACTTCA



7151 TTGCGGCCGC AAATCAATCT AAAGTATATA TGAGTAAACT TGGTCTGACA
AACGCCGGCG TTTAGTTAGA TTTCATATAT ACTCATTTGA ACCAGACTGT



7201 GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT
CAATGGTTAC GAATTAGTCA CTCCGTGGAT AGAGTCGCTA GACAGATAAA



7251 CGTTCATCCA TAGTTGCCTG ACTCCCCGTC GTGTAGATAA CTACGATACG 
GCAAGTAGGT ATCAACGGAC TGAGGGGCAG CACATCTATT GATGCTATGC 



7301 GGAGGGCTTA CCATCTGGCC CCAGTGCTGC AATGATACCG CGAGACCCAC 
CCTCCCGAAT GGTAGACCGG GGTCACGACG TTACTATGGC GCTCTGGGTG 



7351 GCTCACCGGC TCCAGATTTA TCAGCAATAA ACCAGCCAGC CGGAAGGGCC 
CGAGTGGCCG AGGTCTAAAT AGTCGTTATT TGGTCGGTCG GCCTTCCCGG 



7401 GAGCGCAGAA GTGGTCCTGC AACTTTATCC GCCTCCATCC AGTCTATTAA
CTCGCGTCTT CACCAGGACG TTGAAATAGG CGGAGGTAGG TCAGATAATT



7451 TTGTTGCCGG GAAGCTAGAG TAAGTAGTTC GCCAGTTAAT AGTTTGCGCA 
AACAACGGCC CTTCGATCTC ATTCATCAAG CGGTCAATTA TCAAACGCGT 



7501 ACGTTGTTGC CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT 
TGCAACAACG GTAACGATGT CCGTAGCACC ACAGTGCGAG CAGCAAACCA 



7551 ATGGCTTCAT TCAGCTCCGG TTCCCAACGA TCAAGGCGAG TTACATGATC 
TACCGAAGTA AGTCGAGGCC AAGGGTTGCT AGTTCCGCTC AATGTACTAG 
7601 CCCCATGTTG TGCAAAAAAG CGGTTAGCTC CTTCGGTCCT CCGATCGTTG
GGGGTACAAC ACGTTTTTTC GCCAATCGAG GAAGCCAGGA GGCTAGCAAC



7651 TCAGAAGTAA GTTGGCCGCA GTGTTATCAC TCATGGTTAT GGCAGCACTG
AGTCTTCATT CAACCGGCGT CACAATAGTG AGTACCAATA CCGTCGTGAC
PGGTAGGPAT


TCTACGAAAA 


GACACTGACC 


7751 TGAGTACTCA ACCAAGTCAT TCTGAGAATA GTGTATGCGG CGACCGAGTT
ACTCATGAGT TGGTTCAGTA AGACTCTTAT CACATACGCC GCTGGCTCAA



7801 GCTCTTGCCC GGCGTCAATA CGGGATAATA CCGCGCCACA TAGCAGAACT
CGAGAACGGG CCGCAGTTAT GCCCTATTAT GGCGCGGTGT ATCGTCTTGA



7851 TTAAAAGTGC TCATCATTGG AAAACGTTCT TCGGGGCGAA AACTCTCAAG
AATTTTCACG AGTAGTAACC TTTTGCAAGA AGCCCCGCTT TTGAGAGTTC



7901 GATCTTACCG CTGTTGAGAT CCAGTTCGAT GTAACCCACT CGTGCACCCA
CTAGAATGGC GACAACTCTA GGTCAAGCTA CATTGGGTGA GCACGTGGGT



7951 ACTGATCTTC AGCATCTTTT ACTTTCACCA GCGTTTCTGG GTGAGCAAAA
TGACTAGAAG TCGTAGAAAA TGAAAGTGGT CGCAAAGACC CACTCGTTTT



8001 ACAGGAAGGC AAAATGCCGC AAAAAAGGGA ATAAGGGCGA CACGGAAATG
TGTCCTTCCG TTTTACGGCG TTTTTTCCCT TATTCCCGCT GTGCCTTTAC



8051 TTGAATACTC ATACTCTTCC TTTTTCAATA TTATTGAAGC ATTTATCAGG
AACTTATGAG TATGAGAAGG AAAAAGTTAT AATAACTTCG TAAATAGTCC


lioi 
GAAAAATAAA
CAAATAGGGG
AAGGCGCGTG


TAAAG 
1 CTGCAGCCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG
GACGTCGGAC TTATACCCGG TTTGTCCTAT AGACACCATT CGTCAAGGAC



51 CCCCGGCTCA GGGCCAAGAA CAGATGGAAC AGCTGAATAT GGGCCAAACA
GGGGCCGAGT CCCGGTTCTT GTCTACCTTG TCGACTTATA CCCGGTTTGT



101 GGATATCTGT GGTAAGCAGT TCCTGCCCCG GCTCAGGGCC AAGAACAGAT
CCTATAGACA CCATTCGTCA AGGACGGGGC CGAGTCCCGG TTCTTGTCTA



151 GGTCCCCAGA TGCGGTCCAG CCCTCAGCAG TTTCTAGAGA ACCATCAGAT
CCAGGGGTCT ACGCCAGGTC GGGAGTCGTC AAAGATCTCT TGGTAGTCTA



201 GTTTCCAGGG TGCCCCAAGG ACCTGAAATG ACCCTGTGCC TTATTTGAAC
CAAAGGTCCC ACGGGGTTCC TGGACTTTAC TGGGACACGG AATAAACTTG



251 TAACCAATCA GTTCGCTTCT CGCTTCTGTT CGCGCGCTTC TGCTCCCCGA
ATTGGTTAGT CAAGCGAAGA GCGAAGACAA GCGCGCGAAG ACGAGGGGCT



301 GCTCAATAAA AGAGCCCACA ACCCCTCACT CGGGGCGCCA GTCCTCCGAT
CGAGTTATTT TCTCGGGTGT TGGGGAGTGA GCCCCGCGGT CAGGAGGCTA



351 TGACTGAGTC GCCCGGGTAC CCGTGTATCC AATAAACCCT CTTGCAGTTG
ACTGACTCAG CGGGCCCATG GGCACATAGG TTATTTGGGA GAACGTCAAC



401 CATCCGACTT GTGGTCTCGC TGTTCCTTGG GAGGGTCTCC TCTGAGTGAT
GTAGGCTGAA CACCAGAGCG ACAAGGAACC CTCCCAGAGG AGACTCACTA



451 TGACTACCCG TCAGCGGGGG TCTTTCATTT GGGGGCTCGT CCGGGATCGG
ACTGATGGGC AGTCGCCCCC AGAAAGTAAA CCCCCGAGCA GGCCCTAGCC



501 GAGACCCCTG CCCAGGGACC ACCGACCCAC CACCGGGAGG CAAGCTGGCC
CTCTGGGGAC GGGTCCCTGG TGGCTGGGTG GTGGCCCTCC GTTCGACCGG



551 AGCAACTTAT CTGTGTCTGT CCGATTGTCT AGTGTCTATG ACTGATTTTA
TCGTTGAATA GACACAGACA GGCTAACAGA TCACAGATAC TGACTAAAAT



601 TGCGCCTGCG TCGGTACTAG TTAGCTAACT AGCTCTGTAT CTGGCGGACC
ACGCGGACGC AGCCATGATC AATCGATTGA TCGAGACATA GACCGCCTGG



651 CGTGGTGGAA CTGACGAGTT CTGAACACCC GGCCGCAACC CTGGGAGACG
GCACCACCTT GACTGCTCAA GACTTGTGGG CCGGCGTTGG GACCCTCTGC



701 TCCCAGGGAC TTTGGGGGCC GTTTTTGTGG CCCGACCTGA GGAAGGGAGT
AGGGTCCCTG AAACCCCCGG CAAAAACACC GGGCTGGACT CCTTCCCTCA



751 CGATGTGGAA TCCGACCCCG TCAGGATATG TGGTTCTGGT AGGAGACGAG
GCTACACCTT AGGCTGGGGC AGTCCTATAC ACCAAGACCA TCCTCTGCTC



801 AACCTAAAAC AGTTCCCGCC TCCGTCTGAA TTTTTGCTTT CGGTTTGGAA
TTGGATTTTG TCAAGGGCGG AGGCAGACTT AAAAACGAAA GCCAAACCTT



851 CCGAAGCCGC GCGTCTTGTC TGCTGCAGCA TCGTTCTGTG TTGTCTCTGT
GGCTTCGGCG CGCAGAACAG ACGACGTCGT AGCAAGACAC AACAGAGACA



901 CTGACTGTGT TTCTGTATTT GTCTGAAAAT TAGGGCCAGA CTGTTACCAC
GACTGACACA AAGACATAAA CAGACTTTTA ATCCCGGTCT GACAATGGTG



951 TCCCTTAAGT TTGACCTTAG GTAACTGGAA AGATGTCGAG CGGCTCGCTC
AGGGAATTCA AACTGGAATC CATTGACCTT TCTACAGCTC GCCGAGCGAG



1001 ACAACCAGTC GGTAGATGTC AAGAAGAGAC GTTGGGTTAC CTTCTGCTCT
TGTTGGTCAG CCATCTACAG TTCTTCTCTG CAACCCAATG GAAGACGAGA



1051 GCAGAATGGC CAACCTTTAA CGTCGGATGG CCGCGAGACG GCACCTTTAA
CGTCTTACCG GTTGGAAATT GCAGCCTACC GGCGCTCTGC CGTGGAAATT



1101 CCGAGACCTC ATCACCCAGG TTAAGATCAA GGTCTTTTCA CCTGGCCCGC
GGCTCTGGAG TAGTGGGTCC AATTCTAGTT CCAGAAAAGT GGACCGGGCG



1151 ATGGACACCC AGACCAGGTC CCCTACATCG TGACCTGGGA AGCCTTGGCT
TACCTGTGGG TCTGGTCCAG GGGATGTAGC ACTGGACCCT TCGGAACCGA



1201 TTTGACCCCC CTCCCTGGGT CAAGCCCTTT GTACACCCTA AGCCTCCGCC
AAACTGGGGG GAGGGACCCA GTTCGGGAAA CATGTGGGAT TCGGAGGCGG



1251 TCCTCTTCCT CCATCCGCCC CGTCTCTCCC CCTTGAACCT CCTCGTTCGA
AGGAGAAGGA GGTAGGCGGG GCAGAGAGGG GGAACTTGGA GGAGCAAGCT



1301 CCCCGCCTCG ATCCTCCCTT TATCCAGCCC TCACTCCTTC TCTAGGCGCC
GGGGCGGAGC TAGGAGGGAA ATAGGTCGGG AGTGAGGAAG AGATCCGCGG



1351 GGCCGCTCTA GCCCATTAAT ACGACTCACT ATAGGGCGAT TCGAACACCA
CCGGCGAGAT CGGGTAATTA TGCTGAGTGA TATCCCGCTA AGCTTGTGGT



1401 TGCACCATCA TCATCATCAC GTCGACGAAC AGAAACTCAT TTCCGAAGAA
ACGTGGTAGT AGTAGTAGTG CAGCTGCTTG TCTTTGAGTA AAGGCTTCTT



1451 GACCTACTCG AGATGGGCGT GATTACGGAT TCACTGGCCG TCGTTTTACA
CTGGATGAGC TCTACCCGCA CTAATGCCTA AGTGACCGGC AGCAAAATGT



1501 ACGTCGTGAC TGGGAAAACC CTGGCGTTAC CCAACTTAAT CGCCTTGCAG
TGCAGCACTG ACCCTTTTGG GACCGCAATG GGTTGAATTA GCGGAACGTC



1551 CACATCCCCC TTTCGCCAGC TGGCGTAATA GCGAAGAGGC CCGCACCGAT
GTGTAGGGGG AAAGCGGTCG ACCGCATTAT CGCTTCTCCG GGCGTGGCTA



1601 CGCCCTTCCC AACAGTTACG CAGCCTGAAT GGCGAATGGC GCTTTGCCTG
GCGGGAAGGG TTGTCAATGC GTCGGACTTA CCGCTTACCG CGAAACGGAC



1651 GTTTCCGGCA CCAGAAGCGG TGCCGGAAAG CTGGCTGGAG TGCGATCTTC
CAAAGGCCGT GGTCTTCGCC ACGGCCTTTC GACCGACCTC ACGCTAGAAG


1701 CTGAGGCCGA TACTGTCGTC GTCCCCTCAA ACTGGCAGAT GCACGGTTAC 
GACTCCGGCT ATGACAGCAG CAGGGGAGTT TGACCGTCTA CGTGCCAATG 



1751 GATGCGCCCA TCTACACCAA CGTGACCTAT CCCATTACGG TCAATCCGCC
CTACGCGGGT AGATGTGGTT GCACTGGATA GGGTAATGCC AGTTAGGCGG 



1801 GTTTGTTCCC ACGGAGAATC CGACGGGTTG TTACTCGCTC ACATTTAATG
CAAACAAGGG TGCCTCTTAG GCTGCCCAAC AATGAGCGAG TGTAAATTAC 



1851 TTGATGAAAG CTGGCTACAG GAAGGCCAGA CGCGAATTAT TTTTGATGGC 
AACTACTTTC GACCGATGTC CTTCCGGTCT GCGCTTAATA AAAACTACCG 
1901 GTTAACTCGG CGTTTCATCT GTGGTGCAAC GGGCGCTGGG TCGGTTACGG
CAATTGAGCC GCAAAGTAGA CACCACGTTG CCCGCGACCC AGCCAATGCC 



1951 CCAGGACAGT CGTTTGCCGT CTGAATTTGA CCTGAGCGCA TTTTTACGCG 
GGTCCTGTCA GCAAACGGCA GACTTAAACT GGACTCGCGT AAAAATGCGC 



2001 CCGGAGAAAA CCGCCTCGCG GTGATGGTGC TGCGCTGGAG TGACGGCAGT 
GGCCTCTTTT GGCGGAGCGC CACTACCACG ACGCGACCTC ACTGCCGTCA 



2051 TATCTGGAAG ATCAGGATAT GTGGCGGATG AGCGGCATTT TCCGTGACGT 
ATAGACCTTC TAGTCCTATA CACCGCCTAC TCGCCGTAAA AGGCACTGCA 



2101 CTCGTTGCTG CATAAACCGA CTACACAAAT CAGCGATTTC CATGTTGCCA 
GAGCAACGAC GTATTTGGCT GATGTGTTTA GTCGCTAAAG GTACAACGGT 



2151 CTCGCTTTAA TGATGATTTC AGCCGCGCTG TACTGGAGGC TGAAGTTCAG 
GAGCGAAATT ACTACTAAAG TCGGCGCGAC ATGACCTCCG ACTTCAAGTC 



2201 ATGTGCGGCG AGTTGCGTGA CTACCTACGG GTAACAGTTT CTTTATGGCA 
TACACGCCGC TCAACGCACT GATGGATGCC CATTGTCAAA GAAATACCGT 



2251 GGGTGAAACG CAGGTCGCCA GCGGCACCGC GCCTTTCGGC GGTGAAATTA
CCCACTTTGC GTCCAGCGGT CGCCGTGGCG CGGAAAGCCG CCACTTTAAT



2301 TCGATGAGCG TGGTGGTTAT GCCGATCGCG TCACACTACG TCTGAACGTC
AGCTACTCGC ACCACCAATA CGGCTAGCGC AGTGTGATGC AGACTTGCAG



2351 GAAAACCCGA AACTGTGGAG CGCCGAAATC CCGAATCTCT ATCGTGCGGT
CTTTTGGGCT TTGACACCTC GCGGCTTTAG GGCTTAGAGA TAGCACGCCA



2401 GGTTGAACTG CACACCGCCG ACGGCACGCT GATTGAAGCA GAAGCCTGCG
CCAACTTGAC GTGTGGCGGC TGCCGTGCGA CTAACTTCGT CTTCGGACGC 



2451 ATGTCGGTTT CCGCGAGGTG CGGATTGAAA ATGGTCTGCT GCTGCTGAAC
TACAGCCAAA GGCGCTCCAC GCCTAACTTT TACCAGACGA CGACGACTTG



2501 GGCAAGCCGT TGCTGATTCG AGGCGTTAAC CGTCACGAGC ATCATCCTCT 
CCGTTCGGCA ACGACTAAGC TCCGCAATTG GCAGTGCTCG TAGTAGGAGA 



2551 GCATGGTCAG GTCATGGATG AGCAGACGAT GGTGCAGGAT ATCCTGCTGA 
CGTACCAGTC CAGTACCTAC TCGTCTGCTA CCACGTCCTA TAGGACGACT 



2601 TGAAGCAGAA CAACTTTAAC GCCGTGCGCT GTTCGCATTA TCCGAACCAT
ACTTCGTCTT GTTGAAATTG CGGCACGCGA CAAGCGTAAT AGGCTTGGTA 



2651 CCGCTGTGGT ACACGCTGTG CGACCGCTAC GGCCTGTATG TGGTGGATGA
GGCGACACCA TGTGCGACAC GCTGGCGATG CCGGACATAC ACCACCTACT 



2701 AGCCAATATT GAAACCCACG GCATGGTGCC AATGAATCGT CTGACCGATG
TCGGTTATAA CTTTGGGTGC CGTACCACGG TTACTTAGCA GACTGGCTAC 



2751 ATCCGCGCTG GCTACCGGCG ATGAGCGAAC GCGTAACGCG AATGGTGCAG 
TAGGCGCGAC CGATGGCCGC TACTCGCTTG CGCATTGCGC TTACCACGTC 



2801 CGCGATCGTA ATCACCCGAG TGTGATCATC TGGTCGCTGG GGAATGAATC 
GCGCTAGCAT TAGTGGGCTC ACACTAGTAG ACCAGCGACC CCTTACTTAG 
2851 AGGCCACGGC GCTAATCACG ACGCGCTGTA TCGCTGGATC AAATCTGTCG
TCCGGTGCCG CGATTAGTGC TGCGCGACAT AGCGACCTAG TTTAGACAGC 



2901 ATCCTTCCCG CCCGGTGCAG TATGAAGGCG GCGGAGCCGA CACCACGGCC 
TAGGAAGGGC GGGCCACGTC ATACTTCCGC CGCCTCGGCT GTGGTGCCGG 



2951 ACCGATATTA TTTGCCCGAT GTACGCGCGC GTGGATGAAG ACCAGCCCTT
TGGCTATAAT AAACGGGCTA CATGCGCGCG CACCTACTTC TGGTCGGGAA 



3001 CCCGGCTGTG CCGAAATGGT CCATCAAAAA ATGGCTTTCG CTACCTGGAG 
GGGCCGACAC GGCTTTACCA GGTAGTTTTT TACCGAAAGC GATGGACCTC 



3051 AGACGCGCCC GCTGATCCTT TGCGAATACG CCCACGCGAT GGGTAACAGT 
TCTGCGCGGG CGACTAGGAA ACGCTTATGC GGGTGCGCTA CCCATTGTCA 



3101 CTTGGCGGTT TCGCTAAATA CTGGCAGGCG TTTCGTCAGT ATCCCCGTTT 
GAACCGCCAA AGCGATTTAT GACCGTCCGC AAAGCAGTCA TAGGGGCAAA



3151 ACAGGGCGGC TTCGTCTGGG ACTGGGTGGA TCAGTCGCTG ATTAAATATG 
TGTCCCGCCG AAGCAGACCC TGACCCACCT AGTCAGCGAC TAATTTATAC 



3201 ATGAAAACGG CAACCCGTGG TCGGCTTACG GCGGTGATTT TGGCGATACG
TACTTTTGCC GTTGGGCACC AGCCGAATGC CGCCACTAAA ACCGCTATGC



3251 CCGAACGATC GCCAGTTCTG TATGAACGGT CTGGTCTTTG CCGACCGCAC
GGCTTGCTAG CGGTCAAGAC ATACTTGCCA GACCAGAAAC GGCTGGCGTG



3301 GCCGCATCCA GCGCTGACGG AAGCAAAACA CCAGCAGCAG TTTTTCCAGT
CGGCGTAGGT CGCGACTGCC TTCGTTTTGT GGTCGTCGTC AAAAAGGTCA



3351 TCCGTTTATC CGGGCAAACC ATCGAAGTGA CCAGCGAATA CCTGTTCCGT
AGGCAAATAG GCCCGTTTGG TAGCTTCACT GGTCGCTTAT GGACAAGGCA



3401 CATAGCGATA ACGAGCTCCT GCACTGGATG GTGGCGCTGG ATGGTAAGCC
GTATCGCTAT TGCTCGAGGA CGTGACCTAC CACCGCGACC TACCATTCGG



3451 GCTGGCAAGC GGTGAAGTGC CTCTGGATGT CGCTCCACAA GGTAAACAGT 
CGACCGTTCG CCACTTCACG GAGACCTACA GCGAGGTGTT CCATTTGTCA 



3501 TGATTGAACT GCCTGAACTA CCGCAGCCGG AGAGCGCCGG GCAACTCTGG 
ACTAACTTGA CGGACTTGAT GGCGTCGGCC TCTCGCGGCC CGTTGAGACC 



3551 CTCACAGTAC GCGTAGTGCA ACCGAACGCG ACCGCATGGT CAGAAGCCGG 
GAGTGTCATG CGCATCACGT TGGCTTGCGC TGGCGTACCA GTCTTCGGCC 



3601 GCACATCAGC GCCTGGCAGC AGTGGCGTCT GGCGGAAAAC CTCAGTGTGA 
CGTGTAGTCG CGGACCGTCG TCACCGCAGA CCGCCTTTTG GAGTCACACT 



3651 CGCTCCCCGC CGCGTCCCAC GCCATCCCGC ATCTGACCAC CAGCGAAATG 
GCGAGGGGCG GCGCAGGGTG CGGTAGGGCG TAGACTGGTG GTCGCTTTAC 



3701 GATTTTTGCA TCGAGCTGGG TAATAAGCGT TGGCAATTTA ACCGCCAGTC 
CTAAAAACGT AGCTCGACCC ATTATTCGCA ACCGTTAAAT TGGCGGTCAG 



3751 AGGCTTTCTT TCACAGATGT GGATTGGCGA TAAAAAACAA CTGCTGACGC
TCCGAAAGAA AGTGTCTACA CCTAACCGCT ATTTTTTGTT GACGACTGCG 
3801 CGCTGCGCGA TCAGTTCACC CGTGTCGATA GATCTGGAGG TGGTGGCAGC
GCGACGCGCT AGTCAAGTGG GCACAGCTAT CTAGACCTCC ACCACCGTCG



3851 AGGCCTTGGC GCGCCGGATC CTTAATTAAC AATTGACCGG TAATAATAGG
TCCGGAACCG CGCGGCCTAG GAATTAATTG TTAACTGGCC ATTATTATCC



3901 TAGATAAGTG ACTGATTAGA TGCATTTCGA CTAGATCCCT CGACCAATTC
ATCTATTCAC TGACTAATCT ACGTAAAGCT GATCTAGGGA GCTGGTTAAG



3951 CGGTTATTTT CCACCATATT GCCGTCTTTT GGCAATGTGA GGGCCCGGAA
GCCAATAAAA GGTGGTATAA CGGCAGAAAA CCGTTACACT CCCGGGCCTT



4001 ACCTGGCCCT GTCTTCTTGA CGAGCATTCC TAGGGGTCTT TCCCCTCTCG
TGGACCGGGA CAGAAGAACT GCTCGTAAGG ATCCCCAGAA AGGGGAGAGC



4051 CCAAAGGAAT GCAAGGTCTG TTGAATGTCG TGAAGGAAGC AGTTCCTCTG
GGTTTCCTTA CGTTCCAGAC AACTTACAGC ACTTCCTTCG TCAAGGAGAC



4101 GAAGCTTCTT GAAGACAAAC AACGTCTGTA GCGACCCTTT GCAGGCAGCG
CTTCGAAGAA CTTCTGTTTG TTGCAGACAT CGCTGGGAAA CGTCCGTCGC



4151 GAACCCCCCA CCTGGCGACA GGTGCCTCTG CGGCCAAAAG CCACGTGTAT
CTTGGGGGGT GGACCGCTGT CCACGGAGAC GCCGGTTTTC GGTGCACATA



4201 AAGATACACC TGCAAAGGCG GCACAACCCC AGTGCCACGT TGTGAGTTGG
TTCTATGTGG ACGTTTCCGC CGTGTTGGGG TCACGGTGCA ACACTCAACC



4251 ATAGTTGTGG AAAGAGTCAA ATGGCTCTCC TCAAGCGTAT TCAACAAGGG
TATCAACACC TTTCTCAGTT TACCGAGAGG AGTTCGCATA AGTTGTTCCC



4301 GCTGAAGGAT GCCCAGAAGG TACCCCATTG TATGGGATCT GATCTGGGGC
CGACTTCCTA CGGGTCTTCC ATGGGGTAAC ATACCCTAGA CTAGACCCCG



4351 CTCGGTGCAC ATGCTTTACA TGTGTTTAGT CGAGGTTAAA AAACGTCTAG
GAGCCACGTG TACGAAATGT ACACAAATCA GCTCCAATTT TTTGCAGATC



4401 GCCCCCCGAA CCACGGGGAC GTGGTTTTCC TTTGAAAAAC ACGATGATAA
CGGGGGGCTT GGTGCCCCTG CACCAAAAGG AAACTTTTTG TGCTACTATT



4451 TACCATGAAA AAGCCTGAAC TCACCGCGAC GTCTGTCGAG AAGTTTCTGA
ATGGTACTTT TTCGGACTTG AGTGGCGCTG CAGACAGCTC TTCAAAGACT



4501 TCGAAAAGTT CGACAGCGTC TCCGACCTGA TGCAGCTCTC GGAGGGCGAA
AGCTTTTCAA GCTGTCGCAG AGGCTGGACT ACGTCGAGAG CCTCCCGCTT



4551 GAATCTCGTG CTTTCAGCTT CGATGTAGGA GGGCGTGGAT ATGTCCTGCG
CTTAGAGCAC GAAAGTCGAA GCTACATCCT CCCGCACCTA TACAGGACGC



4601 GGTAAATAGC TGCGCCGATG GTTTCTACAA AGATCGTTAT GTTTATCGGC
CCATTTATCG ACGCGGCTAC CAAAGATGTT TCTAGCAATA CAAATAGCCG



4651 ACTTTGCATC GGCCGCGCTC CCGATTCCGG AAGTGCTTGA CATTGGGGAA
TGAAACGTAG CCGGCGCGAG GGCTAAGGCC TTCACGAACT GTAACCCCTT



4701 TTTAGCGAGA GCCTGACCTA TTGCATCTCC CGCCGTGCAC AGGGTGTCAC
AAATCGCTCT CGGACTGGAT AACGTAGAGG GCGGCACGTG TCCCACAGTG
4751 GTTGCAAGAC CTGCCTGAAA CCGAACTGCC CGCTGTTCTG CAGCCGGTCG
CAACGTTCTG GACGGACTTT GGCTTGACGG GCGACAAGAC GTCGGCCAGC 



4801 CGGAGGCCAT GGATGCGATC GCTGCGGCCG ATCTTAGCCA GACGAGCGGG 
GCCTCCGGTA CCTACGCTAG CGACGCCGGC TAGAATCGGT CTGCTCGCCC 



4851 TTCGGCCCAT TCGGACCGCA AGGAATCGGT CAATACACTA CATGGCGTGA
AAGCCGGGTA AGCCTGGCGT TCCTTAGCCA GTTATGTGAT GTACCGCACT 



4901 TTTCATATGC GCGATTGCTG ATCCCCATGT GTATCACTGG CAAACTGTGA 
AAAGTATACG CGCTAACGAC TAGGGGTACA CATAGTGACC GTTTGACACT 



4951 TGGACGACAC CGTCAGTGCG TCCGTCGCGC AGGCTCTCGA TGAGCTGATG
ACCTGCTGTG GCAGTCACGC AGGCAGCGCG TCCGAGAGCT ACTCGACTAC 



5001 CTTTGGGCCG AGGACTGCCC CGAAGTCCGG CACCTCGTGC ACGCGGATTT 
GAAACCCGGC TCCTGACGGG GCTTCAGGCC GTGGAGCACG TGCGCCTAAA 



5051 CGGCTCCAAC AATGTCCTGA CGGACAATGG CCGCATAACA GCGGTCATTG 
GCCGAGGTTG TTACAGGACT GCCTGTTACC GGCGTATTGT CGCCAGTAAC 



5101 ACTGGAGCGA GGCGATGTTC GGGGATTCCC AATACGAGGT CGCCAACATC
TGACCTCGCT CCGCTACAAG CCCCTAAGGG TTATGCTCCA GCGGTTGTAG



5151 TTCTTCTGGA GGCCGTGGTT GGCTTGTATG GAGCAGCAGA CGCGCTACTT
AAGAAGACCT CCGGCACCAA CCGAACATAC CTCGTCGTCT GCGCGATGAA



5201 CGAGCGGAGG CATCCGGAGC TTGCAGGATC GCCGCGGCTC CGGGCGTATA
GCTCGCCTCC GTAGGCCTCG AACGTCCTAG CGGCGCCGAG GCCCGCATAT



5251 TGCTCCGCAT TGGTCTTGAC CAACTCTATC AGAGCTTGGT TGACGGCAAT
ACGAGGCGTA ACCAGAACTG GTTGAGATAG TCTCGAACCA ACTGCCGTTA 



5301 TTCGATGATG CAGCTTGGGC GCAGGGTCGA TGCGACGCAA TCGTCCGATC 
AAGCTACTAC GTCGAACCCG CGTCCCAGCT ACGCTGCGTT AGCAGGCTAG



5351 CGGAGCCGGG ACTGTCGGGC GTACACAAAT CGCCCGCAGA AGCGCGGCCG 
GCCTCGGCCC TGACAGCCCG CATGTGTTTA GCGGGCGTCT TCGCGCCGGC 



5401 TCTGGACCGA TGGCTGTGTA GAAGTACTCG CCGATAGTGG AAACCGACGC 
AGACCTGGCT ACCGACACAT CTTCATGAGC GGCTATCACC TTTGGCTGCG 



5451 CCCAGCACTC GTCCGAGGGC AAAGGAATAG AGTAGATGCC GACCGGGATC 
GGGTCGTGAG CAGGCTCCCG TTTCCTTATC TCATCTACGG CTGGCCCTAG 



5501 TATCGATAAA ATAAAAGATT TTATTTAGTC TCCAGAAAAA GGGGGGAATG
ATAGCTATTT TATTTTCTAA AATAAATCAG AGGTCTTTTT CCCCCCTTAC 



5551 AAAGACCCCA CCTGTAGGTT TGGCAAGCTA GCTTAAGTAA CGCCATTTTG 
TTTCTGGGGT GGACATCCAA ACCGTTCGAT CGAATTCATT GCGGTAAAAC 



5601 CAAGGCATGG AAAAATACAT AACTGAGAAT AGAGAAGTTC AGATCAAGGT 
GTTCCGTACC TTTTTATGTA TTGACTCTTA TCTCTTCAAG TCTAGTTCCA 



5651 CAGGAACAGA TGGAACAGCT GAATATGGGC CAAACAGGAT ATCTGTGGTA 
GTCCTTGTCT ACCTTGTCGA CTTATACCCG GTTTGTCCTA TAGACACCAT 
5701 AGCAGTTCCT GCCCCGGCTC AGGGCCAAGA ACAGATGGAA CAGCTGAATA
TCGTCAAGGA CGGGGCCGAG TCCCGGTTCT TGTCTACCTT GTCGACTTAT 



5751 TGGGCCAAAC AGGATATCTG TGGTAAGCAG TTCCTGCCCC GGCTCAGGGC 
ACCCGGTTTG TCCTATAGAC ACCATTCGTC AAGGACGGGG CCGAGTCCCG 



5801 CAAGAACAGA TGGTCCCCAG ATGCGGTCCA GCCCTCAGCA GTTTCTAGAG 
GTTCTTGTCT ACCAGGGGTC TACGCCAGGT CGGGAGTCGT CAAAGATCTC 



5851 AACCATCAGA TGTTTCCAGG GTGCCCCAAG GACCTGAAAT GACCCTGTGC 
TTGGTAGTCT ACAAAGGTCC CACGGGGTTC CTGGACTTTA CTGGGACACG 



5901 CTTATTTGAA CTAACCAATC AGTTCGCTTC TCGCTTCTGT TCGCGCGCTT 
GAATAAACTT GATTGGTTAG TCAAGCGAAG AGCGAAGACA AGCGCGCGAA 



5951 CTGCTCCCCG AGCTCAATAA AAGAGCCCAC AACCCCTCAC TCGGGGCGCC
GACGAGGGGC TCGAGTTATT TTCTCGGGTG TTGGGGAGTG AGCCCCGCGG 



6001 AGTCCTCCGA TTGACTGAGT CGCCCGGGTA CCCGTGTATC CAATAAACCC 
TCAGGAGGCT AACTGACTCA GCGGGCCCAT GGGCACATAG GTTATTTGGG 



6051 TCTTGCAGTT GCATCCGACT TGTGGTCTCG CTGTTCCTTG GGAGGGTCTC
AGAACGTCAA CGTAGGCTGA ACACCAGAGC GACAAGGAAC CCTCCCAGAG



6101 CTCTGAGTGA TTGACTACCC GTCAGCGGGG GTCTTTCATT CATGCAGCAT
GAGACTCACT AACTGATGGG CAGTCGCCCC CAGAAAGTAA GTACGTCGTA



6151 GTATCAAAAT TAATTTGGTT TTTTTTCTTA AGTATTTACA TTAAATGGCC
CATAGTTTTA ATTAAACCAA AAAAAAGAAT TCATAAATGT AATTTACCGG



6201 ATAGTTGCAT TAATGAATCG GCCAACGCGC GGGGAGAGGC GGTTTGCGTA
TATCAACGTA ATTACTTAGC CGGTTGCGCG CCCCTCTCCG CCAAACGCAT 



6251 TTGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC TCGGTCGTTC
AACCGCGAGA AGGCGAAGGA GCGAGTGACT GAGCGACGCG AGCCAGCAAG



6301 GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT ACGGTTATCC 
CCGACGCCGC TCGCCATAGT CGAGTGAGTT TCCGCCATTA TGCCAATAGG 



6351 ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA 
TGTCTTAGTC CCCTATTGCG TCCTTTCTTG TACACTCGTT TTCCGGTCGT 



6401 AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT TTCCATAGGC
TTTCCGGTCC TTGGCATTTT TCCGGCGCAA CGACCGCAAA AAGGTATCCG 



6451 TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG TCAGAGGTGG 
AGGCGGGGGG ACTGCTCGTA GTGTTTTTAG CTGCGAGTTC AGTCTCCACC 



6501 CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC CTGGAAGCTC 
GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGGG GACCTTCGAG 



6551 CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG 
GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATGGCCT ATGGACAGGC 



6601 CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAGG
GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTATCGAG TGCGACATCC 
6651 TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA
ATAGAGTCAA GCCACATCCA GCAAGCGAGG TTCGACCCGA CACACGTGCT 



6701 ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC TATCGTCTTG
TGGGGGGCAA GTCGGGCTGG CGACGCGGAA TAGGCCATTG ATAGCAGAAC



6751 AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC AGCCACTGGT
TCAGGTTGGG CCATTCTGTG CTGAATAGCG GTGACCGTCG TCGGTGACCA



6801 AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG AGTTCTTGAA
TTGTCCTAAT CGTCTCGCTC CATACATCCG CCACGATGTC TCAAGAACTT



6851 GTGGTGGCCT AACTACGGCT ACACTAGAAG AACAGTATTT GGTATCTGCG
CACCACCGGA TTGATGCCGA TGTGATCTTC TTGTCATAAA CCATAGACGC



6901 CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC
GAGACGACTT CGGTCAATGG AAGCCTTTTT CTCAACCATC GAGAACTAGG



6951 GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA
CCGTTTGTTT GGTGGCGACC ATCGCCACCA AAAAAACAAA CGTTCGTCGT



7001 GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ATCTTTTCTA
CTAATGCGCG TCTTTTTTTC CTAGAGTTCT TCTAGGAAAC TAGAAAAGAT



7051 CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG GATTTTGGTC
GCCCCAGACT GCGAGTCACC TTGCTTTTGA GTGCAATTCC CTAAAACCAG



7101 ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTGC GGCCGCAAAT
TACTCTAATA GTTTTTCCTA GAAGTGGATC TAGGAAAACG CCGGCGTTTA



7151 CAATCTAAAG TATATATGAG TAAACTTGGT CTGACAGTTA CCAATGCTTA
GTTAGATTTC ATATATACTC ATTTGAACCA GACTGTCAAT GGTTACGAAT



7201 ATCAGTGAGG CACCTATCTC AGCGATCTGT CTATTTCGTT CATCCATAGT
TAGTCACTCC GTGGATAGAG TCGCTAGACA GATAAAGCAA GTAGGTATCA



7251 TGCCTGACTC CCCGTCGTGT AGATAACTAC GATACGGGAG GGCTTACCAT
ACGGACTGAG GGGCAGCACA TCTATTGATG CTATGCCCTC CCGAATGGTA



7301 CTGGCCCCAG TGCTGCAATG ATACCGCGAG ACCCACGCTC ACCGGCTCCA
GACCGGGGTC ACGACGTTAC TATGGCGCTC TGGGTGCGAG TGGCCGAGGT


7351 GATTTATCAG CAATAAACCA GCCAGCCGGA AGGGCCGAGC GCAGAAGTGG
CTAAATAGTC GTTATTTGGT CGGTCGGCCT TCCCGGCTCG CGTCTTCACC 



7401 TCCTGCAACT TTATCCGCCT CCATCCAGTC TATTAATTGT TGCCGGGAAG 
AGGACGTTGA AATAGGCGGA GGTAGGTCAG ATAATTAACA ACGGCCCTTC 



7451 CTAGAGTAAG TAGTTCGCCA GTTAATAGTT TGCGCAACGT TGTTGCCATT
GATCTCATTC ATCAAGCGGT CAATTATCAA ACGCGTTGCA ACAACGGTAA 



7501 GCTACAGGCA TCGTGGTGTC ACGCTCGTCG TTTGGTATGG CTTCATTCAG
CGATGTCCGT AGCACCACAG TGCGAGCAGC AAACCATACC GAAGTAAGTC 



7551 CTCCGGTTCC CAACGATCAA GGCGAGTTAC ATGATCCCCC ATGTTGTGCA 
GAGGCCAAGG GTTGCTAGTT CCGCTCAATG TACTAGGGGG TACAACACGT 
7601 AAAAAGCGGT TAGCTCCTTC GGTCCTCCGA TCGTTGTCAG AAGTAAGTTG
TTTTTCGCCA ATCGAGGAAG CCAGGAGGCT AGCAACAGTC TTCATTCAAC



7651 GCCGCAGTGT TATCACTCAT GGTTATGGCA GCACTGCATA ATTCTCTTAC
CGGCGTCACA ATAGTGAGTA CCAATACCGT CGTGACGTAT TAAGAGAATG



7701 TGTCATGCCA TCCGTAAGAT GCTTTTCTGT GACTGGTGAG TACTCAACCA
ACAGTACGGT AGGCATTCTA CGAAAAGACA CTGACCACTC ATGAGTTGGT



7751 AGTCATTCTG AGAATAGTGT ATGCGGCGAC CGAGTTGCTC TTGCCCGGCG
TCAGTAAGAC TCTTATCACA TACGCCGCTG GCTCAACGAG AACGGGCCGC



7801 TCAATACGGG ATAATACCGC GCCACATAGC AGAACTTTAA AAGTGCTCAT
AGTTATGCCC TATTATGGCG CGGTGTATCG TCTTGAAATT TTCACGAGTA



7851 CATTGGAAAA CGTTCTTCGG GGCGAAAACT CTCAAGGATC TTACCGCTGT
GTAACCTTTT GCAAGAAGCC CCGCTTTTGA GAGTTCCTAG AATGGCGACA



7901 TGAGATCCAG TTCGATGTAA CCCACTCGTG CACCCAACTG ATCTTCAGCA
ACTCTAGGTC AAGCTACATT GGGTGAGCAC GTGGGTTGAC TAGAAGTCGT



7951 TCTTTTACTT TCACCAGCGT TTCTGGGTGA GCAAAAACAG GAAGGCAAAA
AGAAAATGAA AGTGGTCGCA AAGACCCACT CGTTTTTGTC CTTCCGTTTT



8001 TGCCGCAAAA AAGGGAATAA GGGCGACACG GAAATGTTGA ATACTCATAC
ACGGCGTTTT TTCCCTTATT CCCGCTGTGC CTTTACAACT TATGAGTATG



8051 TCTTCCTTTT TCAATATTAT TGAAGCATTT ATCAGGGTTA TTGTCTCATG
AGAAGGAAAA AGTTATAATA ACTTCGTAAA TAGTCCCAAT AACAGAGTAC



8101 AGCGGATACA TATTTGAATG TATTTAGAAA AATAAACAAA TAGGGGTTCC
TCGCCTATGT ATAAACTTAC ATAAATCTTT TTATTTGTTT ATCCCCAAGG



8151 GCGCACATTT C
CGCGTGTAAA G